Abstract: The closest vector problem (CVP) and shortest 
(nonzero) vector problem (SVP) are the core algorithmic prob- 
lems on Euclidean lattices. They are central to the applications of 
lattices in many problems of communications and cryptography. 
Kannan's embedding technique is a powerful technique for solving 
the approximate CVP, yet its remarkable practical performance 
is not well understood. In this paper, the embedding technique is 
analyzed from a bounded distance decoding (BDD) viewpoint. We 
present two complementary analyses of the embedding technique: 
We establish a reduction from BDD to Hermite SVP (via unique 
SVP), which can be used along with any Hermite SVP solver 
(including, among others, the Lenstra, Lenstra and Lovász (LLL) 
algorithm), and show that, in the special case of LLL, it performs 
at least as well as Babai's nearest plane algorithm (LLL-aided 
SIC). The former analysis helps to explain the folklore practical 
observation that unique SVP is easier than standard approximate 
SVP. It is proven that when the LLL algorithm is employed, 
the embedding technique can solve the CVP provided that the 
noise norm is smaller than a decoding radius $\lambda_1/(2\gamma)$, where 
$\lambda_1$ is the minimum distance of the lattice, and $\gamma \approx O(2^{n/4})$. 
This substantially improves the previously best known correct 
decoding bound $\gamma \approx O(2^{n})$. Focusing on the applications of BDD 
to decoding of multiple-input multiple-output (MIMO) systems, 
we also prove that BDD of the regularized lattice is optimal 
in terms of the diversity-multiplexing gain tradeoff (DMT), and 
propose practical variants of embedding decoding which require 
no knowledge of the minimum distance of the lattice and/or 
further improve the error performance. 

Index Terms — closest vector problem, lattice decoding, lattice 
reduction, MIMO systems, shortest vector problem 



I. Introduction 

Lattice decoding for the linear multiple-input multiple-output 
(MIMO) channel is a problem of high relevance in multi-antenna, 
broadcast, multi-access, cooperative and other multi-terminal 
communication systems [1]-[3]. Maximum-likelihood (ML) decoding 
for finite constellations carved from lattices can be realized 
efficiently by sphere decoding [4], whose complexity can however 
grow prohibitively with the dimension $n$ [5]. The decoding 
complexity is especially high in the case of coded or distributed 
systems, where the lattice dimension is usually larger [6], [7]. 
Thus, the practical implementation of decoders often has to resort 
to approximate solutions, which mostly fall under two main 
strategies. The first is to reduce the complexity of sphere 
decoding, notably by pruning [8]. Another approach, which we 
investigate in the present paper, is lattice reduction (LR)-aided 
decoding [9], which was used earlier by Babai in [10] and in 
essence applies zero-forcing (ZF), successive interference 
cancellation (SIC) or other suboptimal receivers to a reduced basis 
of the lattice. It was shown in [11] that regularized 
lattice-reduction-aided decoding can achieve the optimal diversity 
and multiplexing tradeoff (DMT) in MIMO fading channels. The 
proximity factors that measure the gap between 
lattice-reduction-aided decoding and (infinite) lattice decoding 
were derived in [12]. Thanks to its average polynomial complexity 
[13]-[15], the Lenstra, Lenstra and Lovász (LLL) reduction [16] is 
widely used in lattice decoding. 

However, the analysis in [12] revealed that lattice-reduction- 
aided decoding exhibits a widening gap to (infinite) lattice 
decoding, so there is a strong demand for computationally 
efficient suboptimal decoding algorithms that offer improved 
performance. Several such approaches are emerging, including 
sampling [17] and embedding [18]. It was shown in [17] that 
the sampling technique can provide a constant improvement 
to the best known upper bound for the signal-to-noise ratio 
(SNR) gain with polynomial complexity. 

Embedding decoding is especially appealing due to 
its excellent performance and polynomial complexity (if 
polynomial-complexity lattice reduction algorithms such as 
LLL reduction are used). The core of the embedding technique 
is to embed an $n$-dimensional lattice and the received vector 
into an $(n+1)$-dimensional lattice. By this means, an 
$n$-dimensional instance of the closest vector problem (CVP) is 
converted into an $(n+1)$-dimensional instance of the shortest 
(nonzero) vector problem (SVP). The receiver extracts the 
transmitted vector from a reduced basis of the extended lattice. 

An "improved lattice reduction" technique that resembles 
embedding was used for MIMO decoding in [19], but it is in 
fact equivalent to LLL-aided SIC. It was recognized in [18] 
that the performance of the embedding technique could be 
significantly improved by carefully choosing the embedding 
parameter, leading to "augmented lattice reduction" (ALR). 
In particular, it was shown in [18] that the LLL algorithm can 
recover the transmitted vector when the noise norm is small 
compared to the minimum distance $\lambda_1$ of the lattice. This 
condition corresponds to a variant of the CVP known as 
Bounded Distance Decoding (BDD). More precisely, $\eta$-BDD 
(with $\eta \le 1/2$) is a special instance of the CVP where the 
noise norm (or, equivalently, the distance from the target vector 
to the lattice) is less than $R = \eta\lambda_1$. The radius $R$ is 
referred to as the (correct) decoding radius of the algorithm. 
BDD instances appear both in coding and in cryptography. 
In coding theory, BDD is a suboptimal decoding strategy 
that enjoys lower complexity compared to ML decoding. For 
specific algebraic codes and for specific lattice codes, there are 
numerous BDD algorithms that achieve the optimal $\eta = 1/2$ in 
polynomial time [23, 21, 22]. On the other hand, for general 
lattices, polynomial complexity algorithms only solve $\eta$-BDD 
for much smaller values of $\eta$. The main general-purpose 
approaches include: Babai's ZF iH; Babai's SIC Ol; and 
the randomized extensions by Klein lEsll . Lindner and Peik- 
ert H, and Liu et al. lEll . 

In cryptography, the observed hardness of BDD has been 
used as a constructive tool. The so-called Learning With 
Errors (LWE) problem [26] (see also the survey [27]) can 
be interpreted as a variant of BDD where the lattice is chosen 
uniformly in a specific family of lattices and the noise vector 
follows a Gaussian distribution with small standard deviation. 
The apparent hardness of LWE in high dimensions has been 
exploited to devise a number of cryptographic protocols, 
including encryption [26], identification [28] and signature 
schemes [29]. 

The embedding technique is a powerful approach to BDD 
for general lattices. Kannan seems to have been the first to 
propose this technique [30]. Since then, Micciancio has used 
it to reduce the CVP to the SVP to prove certain hardness 
results [31], while Nguyen has employed it to break the GGH 
cryptosystem for parameters of practical interest [32]. More 
recently, in the context of cryptography, Lyubashevsky and 
Micciancio revealed a relationship between BDD and variants 
of SVP [33]. Of particular relevance to this paper is the 
relationship between BDD and unique SVP (uSVP), a special 
instance of SVP for lattices whose second minimum is at 
least $\gamma$ times longer than the first minimum. It was shown 
in [33] that $1/(2\gamma)$-BDD can be reduced to $\gamma$-uSVP. This 
relation suggests the following strategy, already used in [18]: 
the embedding parameter should be chosen in such a way that 
the extended lattice exhibits an exponential gap between the 
first and second minimum, ensuring that LLL-reducing the 
extended lattice basis successfully solves the uSVP instance. 



Contributions: Our contributions are twofold: We improve 
the theoretical analysis of the embedding technique, and we 
consider questions raised by the specific application of BDD 
and embedding to communications. 

On the analysis front, we prove that embedding decoding 
using the LLL algorithm can solve $1/(2\gamma)$-BDD for 
$\gamma \approx O(\sqrt{n}\,2^{n/4})$. This is significantly better than the 
bound $\gamma = O(2^{n})$ proven in [18]. We propose two 
complementary proofs for this result. In the first approach, we 
establish a reduction from the unique SVP to the Hermite SVP, 
which consists in finding a non-zero vector of a given lattice, of 
small norm relative to the root determinant. This analysis can be 
specialized to LLL by showing that the LLL algorithm can 
solve $\gamma$-uSVP for $\gamma \approx O(2^{n/4})$. This is stronger than the 
commonly used bound $\gamma = O(2^{n/2})$ in the literature, which in 
fact pertains to approximate SVP. The second approach consists in 
showing that Babai's SIC achieves this correct decoding radius (by 
improving the bound in [12]) and then proving that Kannan's 
embedding with LLL performs at least as well as Babai's SIC. 
For the latter component of this proof, we proceed by explicitly 
following the steps performed by Kannan's embedding. 
The two proofs are of independent interest. The first is not 
restricted to LLL but is suited to any algorithm solving the 
Hermite SVP, while the second provides a precise description 
of how the embedding technique works. 

The reduction from the unique SVP to the Hermite SVP 
helps to explain the long-standing question of why unique SVP is 
easier than standard approximate SVP. It has been known that 
uSVP is potentially easier, and there has been experimental 
evidence that this is indeed the case in practice [34]. However, 
no theoretical justification has been given before. 

On the MIMO communications front, we prove that BDD 
of the regularized lattice is DMT-optimal over Rayleigh fading 
channels. This represents a nontrivial extension of the analysis 
in [11] for $\gamma$-approximation algorithms of CVP. Indeed, it will 
be shown that $\gamma$-approximate algorithms are a special case of 
BDD, because any decoding technique which provides a 
$\gamma$-approximate CVP solution is also able to solve $1/(2\gamma)$-BDD. 
However, the converse is not necessarily true. In addition to 
embedding decoding, this result allows us to establish the 
DMT optimality of other BDD algorithms, such as lattice 
reduction-aided decoding and sampling decoding. 

For practical purposes, we consider the problem of choosing 
the main parameter involved in Kannan's embedding method, 
which we refer to as the embedding parameter. We give 
an alternative embedding parameter that only assumes the 
knowledge of $\lambda_1$ while achieving the same decoding radius 
as [18]. We also consider the case when $\lambda_1$ is not known, 
and show that using multiple calls to this embedding decoder 
with an estimate of $\lambda_1$ achieves essentially the same decoding 
radius as if $\lambda_1$ were known. On the experimental side, we 
propose variants of the embedding technique without knowledge 
of $\lambda_1$ and/or with improved performance and compare them 
with state-of-the-art MIMO decoding techniques by numerical 
simulations in terms of error performance and complexity, 
showing that embedding is nearly optimal in many practical 
scenarios. 

The paper is organized as follows: Section II presents the 
transmission model and a short survey of lattice problems. The 
DMT analysis on BDD is given in Section III. In Section IV, 
we give the two analyses of the decoding radius of the 
embedding technique for solving BDD. In Section V, variants 
of the embedding decoder are presented. Section VI evaluates 
the performance by computer simulation. Some concluding 
remarks are offered in Section VII. 

Notation: Matrices and column vectors are denoted by 
upper and lowercase boldface letters, and the transpose, 
inverse, pseudoinverse of a matrix $\mathbf{B}$ by $\mathbf{B}^T$, $\mathbf{B}^{-1}$, and $\mathbf{B}^{\dagger}$, 
respectively. $\mathbf{I}_n$ is the identity matrix of size $n$. We let $\mathbf{b}_i$, 
$b_{i,j}$ and $b_i$ respectively denote the $i$-th column of matrix $\mathbf{B}$, 
the entry in the $i$-th row and $j$-th column of $\mathbf{B}$, and the 
$i$-th entry in vector $\mathbf{b}$. $\mathrm{Vec}(\mathbf{B})$ stands for the column-by- 
column vectorization of the matrix $\mathbf{B}$. The inner product in 
the Euclidean space between vectors $\mathbf{u}$ and $\mathbf{v}$ is defined as 
$\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u}^T\mathbf{v}$, and the Euclidean norm as $\|\mathbf{u}\| = \sqrt{\langle \mathbf{u}, \mathbf{u} \rangle}$. 
The Kronecker product of matrices $\mathbf{A}$ and $\mathbf{B}$ is written as $\mathbf{A} \otimes \mathbf{B}$. 
If $x$ is a real number, we let $\lfloor x \rceil$ denote its rounding to a 
closest integer. The $\Re$ and $\Im$ prefixes denote the real and 
imaginary parts. We use the standard asymptotic notation 
$f(x) = O(g(x))$ when $\limsup_{x \to \infty} |f(x)/g(x)| < \infty$. 

II. Lattice Problems in MIMO Decoding 

A. System Model 

Consider an $n_T \times n_R$ flat-fading MIMO system model 
consisting of $n_T$ transmitters and $n_R$ receivers 

$$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{N}, \tag{1}$$

where $\mathbf{X} \in \mathbb{C}^{n_T \times T}$, $\mathbf{Y}, \mathbf{N} \in \mathbb{C}^{n_R \times T}$ of block length $T$ denote 
the channel input, output and noise, respectively, and 
$\mathbf{H} \in \mathbb{C}^{n_R \times n_T}$ is the $n_R \times n_T$ full-rank channel gain matrix with 
$n_R \ge n_T$, whose entries are normalized to unit variance. The 
entries of $\mathbf{N}$ are i.i.d. complex Gaussian with variance $\sigma^2$ 
each. The codewords $\mathbf{X}$ satisfy the average power constraint 
$E[\|\mathbf{X}\|_F^2 / T] = 1$. Hence, the signal-to-noise ratio (SNR) at 
each receive antenna is $1/\sigma^2$. 

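When a lattice space-time block code is employed, the 
QAM information vector $\mathbf{x}$ is multiplied by the generator 
matrix $\mathbf{G}$ of the encoding lattice. The $n_T \times T$ codeword 
matrix $\mathbf{X}$ is defined by column-wise stacking of consecutive 
$n_T$-tuples of the vector $\mathbf{s} = \mathbf{G}\mathbf{x} \in \mathbb{C}^{n_T T}$. By column-by- 
column vectorization of the matrices $\mathbf{Y}$ and $\mathbf{N}$ in (1), i.e., 
$\mathbf{y} = \mathrm{Vec}(\mathbf{Y})$ and $\mathbf{n} = \mathrm{Vec}(\mathbf{N})$, the received signal at the 
destination can be expressed as 

$$\mathbf{y} = (\mathbf{I}_T \otimes \mathbf{H})\,\mathbf{G}\mathbf{x} + \mathbf{n}. \tag{2}$$

When $T = 1$ and $\mathbf{G} = \mathbf{I}_{n_T}$, equation (2) reduces to the 
model for uncoded MIMO communication $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$. 
Furthermore, by separating real and imaginary parts, we obtain 
the equivalent $2n_T \times 2n_R$ real-valued model 

$$\begin{bmatrix} \Re\mathbf{y} \\ \Im\mathbf{y} \end{bmatrix} = \begin{bmatrix} \Re\mathbf{H} & -\Im\mathbf{H} \\ \Im\mathbf{H} & \Re\mathbf{H} \end{bmatrix} \begin{bmatrix} \Re\mathbf{x} \\ \Im\mathbf{x} \end{bmatrix} + \begin{bmatrix} \Re\mathbf{n} \\ \Im\mathbf{n} \end{bmatrix}. \tag{3}$$

An equivalent $2n_T T \times 2n_R T$ real model for coded MIMO can 
also be obtained in a similar way. 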
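
The real-valued expansion used for model (3) can be checked numerically. The following is a minimal sketch, assuming NumPy and an uncoded system ($T = 1$, $\mathbf{G} = \mathbf{I}$); the helper name `complex_to_real` and the toy dimensions are illustrative choices, not part of the paper.

```python
import numpy as np

def complex_to_real(H, x, n):
    """Stack real and imaginary parts of y = Hx + n into a real-valued model."""
    H_r = np.block([[H.real, -H.imag],
                    [H.imag,  H.real]])
    x_r = np.concatenate([x.real, x.imag])
    n_r = np.concatenate([n.real, n.imag])
    return H_r, x_r, n_r

rng = np.random.default_rng(0)
n_T, n_R = 2, 3                                   # toy antenna counts (assumption)
H = rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))
x = rng.integers(-2, 2, n_T) + 1j * rng.integers(-2, 2, n_T)
n = 0.1 * (rng.standard_normal(n_R) + 1j * rng.standard_normal(n_R))

y = H @ x + n                                     # complex model
H_r, x_r, n_r = complex_to_real(H, x, n)
y_r = np.concatenate([y.real, y.imag])            # stacked received vector

# The real model reproduces the stacked complex observation.
print(np.allclose(H_r @ x_r + n_r, y_r))
```

The real channel matrix has $2n_R$ rows and $2n_T$ columns, which is why the lattice dimension doubles when passing from the complex to the real model.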

The QAM constellations $\mathcal{C}$ can be interpreted as the shifted 
and scaled version of a finite subset $\mathcal{A}^{n_T}$ of the integer 
lattice $\mathbb{Z}^{n_T}$, i.e., $\mathcal{C} = a(\mathcal{A}^{n_T} + [1/2, \ldots, 1/2]^T)$, where the 
factor $a$ arises from energy normalization. For example, we have 
$\mathcal{A}^{n_T} = \{-\sqrt{M}/2, \ldots, \sqrt{M}/2 - 1\}$ for $M$-QAM signalling. 

Therefore, with scaling and shifting, we consider the 
generic $n \times m$ (with $m \ge n$) real-valued MIMO system model 

$$\mathbf{y} = \mathbf{B}\mathbf{x} + \mathbf{n}, \tag{4}$$

where $\mathbf{B} \in \mathbb{R}^{m \times n}$, given by the real-valued equivalent of 
$(\mathbf{I}_T \otimes \mathbf{H})\,\mathbf{G}$, can be interpreted as the basis matrix of the 
decoding lattice. We have $n = 2n_T T$ and $m = 2n_R T$. The 
data vector $\mathbf{x}$ thus belongs to a finite subset $\mathcal{A}^n \subset \mathbb{Z}^n$ which 
satisfies the average power constraint. 

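The maximum-likelihood (ML) decoder computes 

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathcal{A}^n} \|\mathbf{y} - \mathbf{B}\mathbf{x}\|^2. \tag{5}$$

The ML solution $\hat{\mathbf{x}}$ can be found using the sphere decoding 
algorithm, whose complexity, however, grows exponentially 
with $n$ [5]. 

A suboptimal alternative technique called naive lattice 
decoding (or simply lattice decoding) consists in relaxing the 
constraint due to the signal constellation as follows: 

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathbb{Z}^n} \|\mathbf{y} - \mathbf{B}\mathbf{x}\|^2.$$

A low-complexity approximation of lattice decoding is 
successive interference cancellation (SIC), also known as Babai's 
nearest plane algorithm [10]. It consists in performing the QR 
decomposition $\mathbf{B} = \mathbf{Q}\mathbf{R}$, where $\mathbf{Q}$ has orthonormal columns 
and $\mathbf{R}$ is an upper triangular matrix with nonnegative diagonal 
elements [35]. Multiplying (4) on the left by $\mathbf{Q}^T$, we have 

$$\mathbf{y}' = \mathbf{Q}^T\mathbf{y} = \mathbf{R}\mathbf{x} + \mathbf{n}', \tag{6}$$

where $\mathbf{n}' = \mathbf{Q}^T\mathbf{n}$. An estimate of $\mathbf{x}$ is then found by 
component-wise back-substitution and rounding: 

$$\hat{x}_n = \left\lfloor \frac{y'_n}{r_{n,n}} \right\rceil, \qquad \hat{x}_i = \left\lfloor \frac{y'_i - \sum_{j=i+1}^{n} r_{i,j}\hat{x}_j}{r_{i,i}} \right\rceil, \quad i = n-1, \ldots, 1.$$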
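
The SIC back-substitution just described can be sketched in a few lines. This is a minimal illustration assuming NumPy; the function name `sic_decode` and the toy basis are hypothetical, and `numpy.linalg.qr` may return negative diagonal entries in $\mathbf{R}$, which does not affect the rounding step because the corresponding rows of $\mathbf{Q}^T\mathbf{y}$ change sign as well.

```python
import numpy as np

def sic_decode(B, y):
    """Babai nearest-plane / SIC: QR decomposition, then back-substitution with rounding."""
    Q, R = np.linalg.qr(B)              # reduced QR: Q is m x n, R is n x n upper triangular
    y_prime = Q.T @ y                   # y' = Q^T y = R x + n'
    n = B.shape[1]
    x_hat = np.zeros(n, dtype=int)
    for i in range(n - 1, -1, -1):      # last component first
        residual = y_prime[i] - R[i, i + 1:] @ x_hat[i + 1:]
        x_hat[i] = int(np.rint(residual / R[i, i]))
    return x_hat

# Toy example: columns of B are the basis vectors (2, 0) and (1, 3).
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
x = np.array([1, -2])
noise = np.array([0.3, -0.4])           # norm 0.5, below (1/2) * min |r_ii| = 1
y = B @ x + noise
print(sic_decode(B, y))
```

In this example the noise norm is below half of the smallest diagonal entry of $\mathbf{R}$, so the rounding at each step recovers the transmitted integer.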



B. Lattice Basics 



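We refer the reader to [36], [37] for thorough introductions 
to Euclidean lattices. An $n$-dimensional lattice in the 
$m$-dimensional Euclidean space $\mathbb{R}^m$ ($n \le m$) is the set of 
integer linear combinations of $n$ linearly independent vectors 
$\mathbf{b}_1, \ldots, \mathbf{b}_n \in \mathbb{R}^m$: 

$$\mathcal{L} = \left\{ \sum_{i=1}^{n} x_i \mathbf{b}_i \;\middle|\; x_i \in \mathbb{Z},\ i = 1, \ldots, n \right\}.$$

The matrix $\mathbf{B} = [\mathbf{b}_1 \cdots \mathbf{b}_n]$ is referred to as a basis of the 
lattice $\mathcal{L} = \mathcal{L}(\mathbf{B})$. In matrix form, we have $\mathcal{L} = \{\mathbf{B}\mathbf{x} \mid \mathbf{x} \in \mathbb{Z}^n\}$. 
The dual lattice $\mathcal{L}^*$ is defined as the set of those vectors 
$\mathbf{u}$, such that the inner product $\langle \mathbf{u}, \mathbf{v} \rangle$ belongs to $\mathbb{Z}$, for 
all $\mathbf{v} \in \mathcal{L}$. The dual basis of $\mathbf{B}$, which is a basis of the 
dual lattice $\mathcal{L}^*$, is given by $\mathbf{B}^* = (\mathbf{B}^{\dagger})^T\mathbf{J}$, where $\mathbf{J}$ is the 
column-reversing matrix. If $\mathbf{R}$ and $\mathbf{R}^*$ respectively denote the 
R-factors of the QR decomposition of $\mathbf{B}$ and $\mathbf{B}^*$, then we 
have $r^*_{i,i} = 1/r_{n-i+1,n-i+1}$ for all $i$ [38]. 

The determinant $\det\mathcal{L} = \sqrt{\det(\mathbf{B}^T\mathbf{B})}$ is independent of 
the choice of the basis. A shortest vector of a lattice $\mathcal{L}$ is a 
non-zero vector in $\mathcal{L}$ with the smallest Euclidean norm. The 
norm of any shortest vector of $\mathcal{L}$, often referred to as the 
minimum distance, is denoted by $\lambda_1(\mathcal{L})$ or $\lambda_1(\mathbf{B})$ when a 
basis $\mathbf{B}$ is given. We also let it be denoted by $\lambda_1$ if there is 
no ambiguity concerning the lattice. The Hermite constant is 
defined as 

$$\gamma_n = \sup_{\mathcal{L}} \frac{\lambda_1^2(\mathcal{L})}{(\det\mathcal{L})^{2/n}}, \tag{7}$$

where the supremum is taken over all lattices of dimension $n$. 
There is currently no proof that $\gamma_n$ is an increasing function 
of $n$, although this is very likely to be the case and all known 
bounds on $\gamma_n$ are increasing. For the sake of convenience, we 
define $\bar{\gamma}_n = \max_{1 \le i \le n} \gamma_i$. By Minkowski's theorem [39], we 
have the bound $\gamma_n \le n$ and accordingly $\bar{\gamma}_n \le n$. 

Let $\mathcal{B}(\mathbf{0}, r)$ denote the closed ball of radius $r$ centered 
at $\mathbf{0}$. The notion of minimum distance can be generalized, by 
defining the $i$-th successive minimum $\lambda_i(\mathcal{L})$ (or $\lambda_i$ if there is 
no ambiguity on the lattice) as the smallest radius $r$ such that 
$\mathcal{B}(\mathbf{0}, r)$ contains at least $i$ linearly independent lattice points. 
If $\lambda_2 \ge \gamma\lambda_1$ (with $\gamma \ge 1$), we say that the shortest vector in 
the lattice is $\gamma$-unique. 

For any point $\mathbf{y} \in \mathbb{R}^m$, the distance of $\mathbf{y}$ to the lattice $\mathcal{L}$ is 
denoted by $\mathrm{dist}(\mathbf{y}, \mathbf{B}) = \min_{\mathbf{x} \in \mathbb{Z}^n} \|\mathbf{y} - \mathbf{B}\mathbf{x}\|$. 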
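
The relation between the R-factors of a basis and of its dual basis can be verified numerically. The sketch below assumes NumPy; the helper `dual_basis` and the example matrix are illustrative, and absolute values are taken because `numpy.linalg.qr` does not enforce a nonnegative diagonal.

```python
import numpy as np

def dual_basis(B):
    """Dual basis B* = (B^+)^T J, where J reverses the order of the columns."""
    n = B.shape[1]
    J = np.fliplr(np.eye(n))
    return np.linalg.pinv(B).T @ J

# A 3-dimensional lattice embedded in R^4 (arbitrary full-column-rank example).
B = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 2.0]])
B_dual = dual_basis(B)

r = np.abs(np.diag(np.linalg.qr(B)[1]))            # |r_{i,i}| of B
r_dual = np.abs(np.diag(np.linalg.qr(B_dual)[1]))  # |r*_{i,i}| of B*

print(np.allclose(r_dual, 1.0 / r[::-1]))          # r*_{i,i} = 1 / r_{n-i+1,n-i+1}
```

The product $(\mathbf{B}^*)^T\mathbf{B}$ equals $\mathbf{J}$, so all inner products between dual and primal basis vectors are integers, as required by the definition of the dual lattice.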

C. Lattice Problems 

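We now give precise definitions of the lattice problems that 
are central to this work. In all these problems, the input lattice 
$\mathcal{L}$ is described by an arbitrary basis $\mathbf{B}$. 

• Closest Vector Problem (CVP): 

Given a lattice $\mathcal{L}$ and a vector $\mathbf{y} \in \mathbb{R}^m$, find a vector $\mathbf{B}\mathbf{x} \in 
\mathcal{L}$ such that $\|\mathbf{y} - \mathbf{B}\mathbf{x}\|$ is minimal. 

• $\gamma$-Approximate CVP ($\gamma$-CVP), with $\gamma \ge 1$: 

Given a lattice $\mathcal{L}$ and a vector $\mathbf{y} \in \mathbb{R}^m$, find a vector $\mathbf{B}\mathbf{x} \in 
\mathcal{L}$ such that $\|\mathbf{y} - \mathbf{B}\mathbf{x}\| \le \gamma\,\mathrm{dist}(\mathbf{y}, \mathbf{B})$. 

• $\eta$-Bounded Distance Decoding ($\eta$-BDD), with $\eta \le 1/2$: 
Given a lattice $\mathcal{L}$ and a vector $\mathbf{y}$ such that $\mathrm{dist}(\mathbf{y}, \mathbf{B}) < \eta\lambda_1$, 
find the lattice vector $\mathbf{B}\mathbf{x} \in \mathcal{L}(\mathbf{B})$ closest to $\mathbf{y}$. 

• Shortest Vector Problem (SVP): 

Given a lattice $\mathcal{L}$, find a vector $\mathbf{v} \in \mathcal{L}$ of norm $\lambda_1$. 

• $\gamma$-Approximate SVP ($\gamma$-SVP), with $\gamma \ge 1$: 

Given a lattice $\mathcal{L}$, find a vector $\mathbf{v} \in \mathcal{L}$ such that $0 < \|\mathbf{v}\| \le 
\gamma\lambda_1$. 

• Hermite SVP ($C$-HSVP), with $C \ge 1$: 

Given a lattice $\mathcal{L}$, find a vector $\mathbf{v} \in \mathcal{L}$ such that $0 < \|\mathbf{v}\| \le 
C \det^{1/n}(\mathcal{L})$. 

• $\gamma$-unique SVP ($\gamma$-uSVP), with $\gamma \ge 1$: 

Given a lattice $\mathcal{L}$ such that $\lambda_2(\mathcal{L}) \ge \gamma\lambda_1(\mathcal{L})$, find a vector 
$\mathbf{v} \in \mathcal{L}$ of norm $\lambda_1$. 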

D. LLL Reduction 

A lattice of dimension $n \ge 2$ has infinitely many bases. 
In general, if $\mathbf{B}$ is a full column rank matrix, then every 
matrix $\tilde{\mathbf{B}} = \mathbf{B}\mathbf{U}$ is also a basis of $\mathcal{L}(\mathbf{B})$, when $\mathbf{U}$ is a 
unimodular matrix, i.e., $\det(\mathbf{U}) = \pm 1$ and all elements of $\mathbf{U}$ 
are integers. The aim of lattice reduction is to find a "good" 
basis for a given lattice. The celebrated LLL algorithm [16] 
was the first polynomial-time algorithm that computes a vector 
not much longer than the shortest nonzero vector. 

For the sake of simplicity of notation, in this paper we 
consider the version of the LLL algorithm based on the QR 
decomposition [40] $\mathbf{B} = \mathbf{Q}\mathbf{R}$. Let $\mathbf{q}_i$ be the columns of $\mathbf{Q}$, 
and $r_{i,j}$ be the elements of $\mathbf{R}$. Note that $r_{1,1} = \|\mathbf{b}_1\|$. 

The basis $\mathbf{B}$ is called LLL-reduced if 

$$|r_{j,i}| \le \frac{1}{2}\,|r_{j,j}| \tag{8}$$

for $1 \le j < i \le n$, and 

$$\delta\, r_{i-1,i-1}^2 \le r_{i,i}^2 + r_{i-1,i}^2 \tag{9}$$

for $1 < i \le n$, where $1/4 < \delta \le 1$ is a factor selected to 
achieve a good quality-complexity tradeoff. 

Let $\alpha = 1/(\delta - 1/4)$. From Equations (8) and (9), an LLL- 
reduced basis satisfies the following property: 

$$r_{i,i} \ge \alpha^{-1/2}\, r_{i-1,i-1}, \qquad 1 < i \le n. \tag{10}$$

The latter implies the following bounds (see [16]): 

$$\|\mathbf{b}_1\| \le \alpha^{(n-1)/4} \det\nolimits^{1/n}\mathcal{L}, \tag{11}$$

$$\|\mathbf{b}_1\| \le \alpha^{(n-1)/2}\,\lambda_1. \tag{12}$$

Equation (11) means that LLL solves $C$-HSVP with $C = 
\alpha^{(n-1)/4}$, whereas Equation (12) implies that LLL solves 
both $\gamma$-SVP and $\gamma$-uSVP for $\gamma = \alpha^{(n-1)/2}$. 

Remark 1: As this is the historical choice of [16] and as 
it simplifies (10), (11) and (12), one often sets $\delta = 3/4$ and 
consequently $\alpha = 2$. 

Remark 2: The complex LLL algorithm from [41] handles 
a complex-valued lattice directly (without expanding it into 
a real-valued lattice). It delivers a similar quality guarantee 
with $\alpha = 1/(\delta - 1/2)$. The results of the present work can 
be readily extended to complex LLL. 
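
The two LLL conditions (8) and (9) are easy to test on the R-factor of a given basis. The following is a minimal sketch assuming NumPy; the function name `is_lll_reduced` and the tolerance `tol` are illustrative choices. Only absolute values and squares of the entries of $\mathbf{R}$ are used, so the sign convention of `numpy.linalg.qr` is irrelevant.

```python
import numpy as np

def is_lll_reduced(B, delta=0.75, tol=1e-9):
    """Check size reduction (8) and the Lovasz condition (9) on the R-factor of B."""
    R = np.linalg.qr(B)[1]
    n = R.shape[0]
    # (8): |r_{j,i}| <= |r_{j,j}| / 2 for all j < i
    for i in range(n):
        for j in range(i):
            if abs(R[j, i]) > 0.5 * abs(R[j, j]) + tol:
                return False
    # (9): delta * r_{i-1,i-1}^2 <= r_{i,i}^2 + r_{i-1,i}^2 for all i > 1
    for i in range(1, n):
        if delta * R[i - 1, i - 1] ** 2 > R[i, i] ** 2 + R[i - 1, i] ** 2 + tol:
            return False
    return True

print(is_lll_reduced(np.eye(3)))                              # orthonormal basis
print(is_lll_reduced(np.array([[1.0, 1000.0], [0.0, 1.0]])))  # violates (8)
print(is_lll_reduced(np.array([[1.0, 0.0], [0.0, 0.5]])))     # violates (9)
```

With $\delta = 3/4$, a basis passing both checks satisfies property (10) with $\alpha = 2$.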

E. Lattice Reduction-Aided Decoding 

In order to improve the performance of conventional 
decoders (ZF or SIC), lattice reduction can be used to preprocess 
the channel matrix $\mathbf{B}$. Since the reduced channel matrix is 
much more likely to be well-conditioned, the effect of noise 
amplification upon inverting the system will be moderated. 
The channel model (4) can be rewritten as 

$$\mathbf{y} = (\mathbf{B}\mathbf{U})(\mathbf{U}^{-1}\mathbf{x}) + \mathbf{n} = \tilde{\mathbf{B}}\mathbf{x}' + \mathbf{n}, \qquad \mathbf{x}' = \mathbf{U}^{-1}\mathbf{x}, \tag{13}$$

where $\tilde{\mathbf{B}} = \mathbf{B}\mathbf{U}$ and $\mathbf{U}$ is a unimodular matrix. The ZF 
or SIC estimate $\hat{\mathbf{x}}'$ for the equivalent channel (13) is then 
transformed back into $\hat{\mathbf{x}} = \mathbf{U}\hat{\mathbf{x}}'$. As the resulting estimate $\hat{\mathbf{x}}$ is 
not necessarily in $\mathcal{A}^n$, remapping of $\hat{\mathbf{x}}$ onto the finite lattice $\mathcal{A}^n$ 
is required. 

The correct decoding radius of SIC is given by [12] 

$$R_{\mathrm{SIC}} = \frac{1}{2} \min_{1 \le i \le n} |r_{i,i}|, \tag{14}$$

which means that correct decoding is guaranteed if $\|\mathbf{n}\| < 
R_{\mathrm{SIC}}$. Note that this bound is tight. If the basis is LLL-reduced, 
the gap between $\min_i |r_{i,i}|$ and $\lambda_1$ is bounded. Therefore, 
Babai's nearest plane algorithm [10], or LLL-SIC, can be 
viewed as a basic BDD solver. By using (10) and the fact 
that $\|\mathbf{b}_1\| \ge \lambda_1$, it can be proved that LLL-SIC solves $\eta$-BDD 
for $\eta = \frac{1}{2\alpha^{(n-1)/2}}$ (see [12]). Consequently, the decoding 
radius of LLL-SIC is bounded as follows 

$$R_{\mathrm{LLL\text{-}SIC}} \ge \frac{1}{2\alpha^{(n-1)/2}}\,\lambda_1. \tag{15}$$

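F. Improved bound on the decoding radius of LLL-SIC 

Here, we derive an improved bound on the decoding radius, 
which is better than (15). To the best of our knowledge, this 
is a new result of independent interest. 

Lemma 1: The decoding radius of LLL-SIC satisfies 

$$R_{\mathrm{LLL\text{-}SIC}} \ge \frac{1}{2\sqrt{\bar{\gamma}_n}\,\alpha^{(n-1)/4}}\,\lambda_1 \ge \frac{1}{2\sqrt{n}\,\alpha^{(n-1)/4}}\,\lambda_1. \tag{16}$$

Proof: Suppose the basis $\mathbf{B}$ is LLL-reduced. Let $\mathbf{B} = 
\mathbf{Q}\mathbf{R}$ denote its QR decomposition. By using (7) and (10), we 
obtain, for $1 \le k \le n$, 

$$\lambda_1 \le \sqrt{\gamma_k}\,\big(\det\mathcal{L}[\mathbf{b}_1, \ldots, \mathbf{b}_k]\big)^{1/k} \le \sqrt{\gamma_k}\,\Big(\prod_{i=1}^{k} \alpha^{(k-i)/2}\, r_{k,k}\Big)^{1/k} = \sqrt{\gamma_k}\,\alpha^{(k-1)/4}\, r_{k,k}. \tag{17}$$

Now, by using (14) and the above, we have 

$$R_{\mathrm{LLL\text{-}SIC}} = \frac{1}{2} \min_{1 \le k \le n} \{r_{k,k}\} \ge \min_{1 \le k \le n} \frac{1}{2\sqrt{\gamma_k}\,\alpha^{(k-1)/4}}\,\lambda_1 \ge \frac{1}{2\sqrt{\bar{\gamma}_n}\,\alpha^{(n-1)/4}}\,\lambda_1. \tag{18}$$

This completes the proof. 

□ 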
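
To get a feel for the gap between (15) and the relaxed form of (16), both BDD parameters $\eta$ (decoding radius divided by $\lambda_1$) can be tabulated. This is a small sketch with hypothetical helper names `eta_old` and `eta_new`, using the weaker bound $\bar{\gamma}_n \le n$ and $\alpha = 2$; with the true Hermite constants the new bound would be at least as large.

```python
import math

def eta_old(n, alpha=2.0):
    """BDD parameter from (15): 1 / (2 * alpha^((n-1)/2))."""
    return 1.0 / (2.0 * alpha ** ((n - 1) / 2))

def eta_new(n, alpha=2.0):
    """BDD parameter from the second bound in (16): 1 / (2 * sqrt(n) * alpha^((n-1)/4))."""
    return 1.0 / (2.0 * math.sqrt(n) * alpha ** ((n - 1) / 4))

for n in (2, 6, 7, 20, 50):
    print(n, eta_old(n), eta_new(n), eta_new(n) / eta_old(n))
```

For $\alpha = 2$ the relaxed form with $\sqrt{n}$ overtakes (15) from $n = 7$ onwards, and the ratio $\alpha^{(n-1)/4}/\sqrt{n}$ then grows exponentially with $n$.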



III. DMT ANALYSIS OF BDD 



In this section we will prove that any decoding technique for 
MIMO systems which provides a solution to $\eta$-BDD for some 
constant $\eta$ is optimal from the point of view of the 
diversity-multiplexing gain tradeoff or DMT [42], when a suitable left 
preprocessing is employed. 

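In the present discussion, we suppose for the sake of 
simplicity that $m = n$. Following the notation in [11], we 
consider the equivalent normalized channel model where the 
noise variance is equal to 1: 

$$\mathbf{y}' = \mathbf{B}'\mathbf{x} + \mathbf{n}',$$

where $\mathbf{B}'$ and $\mathbf{n}'$ are obtained from $\mathbf{B}$ and $\mathbf{n}$ by a common 
rescaling such that $n'_i \sim \mathcal{N}(0, 1)$ for all $i$. 
Here $\rho = 1/\sigma^2$ denotes the SNR. Moreover, we consider the 
equivalent regularized system 

$$\mathbf{y}_1 = \mathbf{R}\mathbf{x} + \mathbf{n}_1, \tag{19}$$

where 

$$\begin{bmatrix} \mathbf{B}' \\ \mathbf{I}_n \end{bmatrix} = \mathbf{Q}\mathbf{R}, \qquad \mathbf{y}_1 = \mathbf{Q}^T \begin{bmatrix} \mathbf{y}' \\ \mathbf{0} \end{bmatrix}.$$

From the point of view of receiver architecture, this amounts 
to performing left preprocessing before decoding, by using 
a minimum mean square error generalized decision-feedback 
equalizer (MMSE-GDFE) [43]. Note that $\mathbf{I}_n$ may be replaced 
with any positive definite matrix $\mathbf{T}$ without hurting DMT 
optimality. 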

In [11], it was shown that when finite constellations are 
used, naive lattice decoding of the regularized system (19) 
without taking the constellation bounds into account is DMT- 
optimal. Moreover, it was proven that any decoding technique 
that always provides a solution to $\gamma$-CVP in the regularized 
lattice for some $\gamma$ is also DMT-optimal. Since LLL-SIC and 
LLL-ZF decoding applied to the regularized system turn out 
to solve $\gamma$-CVP, this means that MMSE-GDFE preprocessing 
followed by lattice-reduction-aided decoding is also DMT- 
optimal. 

It is easy to see that any decoding technique which provides 
a $\gamma$-CVP solution $\hat{\mathbf{x}}$ to (19) is also able to solve $\frac{1}{2\gamma}$-BDD. In 
fact, suppose that $\mathbf{y}_1$ is such that $\mathrm{dist}(\mathbf{y}_1, \mathbf{R}) < \frac{1}{2\gamma}\lambda_1(\mathbf{R})$. 
Then the $\gamma$-CVP solution $\hat{\mathbf{x}}$ satisfies 

$$\|\mathbf{y}_1 - \mathbf{R}\hat{\mathbf{x}}\| \le \gamma \min_{\mathbf{x} \in \mathbb{Z}^n} \|\mathbf{y}_1 - \mathbf{R}\mathbf{x}\| = \gamma\,\mathrm{dist}(\mathbf{y}_1, \mathbf{R}) < \frac{\lambda_1(\mathbf{R})}{2}.$$

Since the closest lattice point also lies within distance 
$\lambda_1(\mathbf{R})/2$ of $\mathbf{y}_1$, the triangle inequality shows that the two 
lattice points are less than $\lambda_1(\mathbf{R})$ apart and hence coincide, 
so that $\hat{\mathbf{x}}$ is the optimal solution of (19). 

However, the converse is apparently not true, that is, BDD 
does not necessarily provide $\gamma$-CVP solutions for all $\mathbf{y}_1$. 
Therefore, the analysis in [11] does not extend in a 
straightforward manner to BDD of the regularized lattice. Nevertheless, 
we can show that DMT-optimality holds for all instances 
of BDD, and not only for $\gamma$-CVP, by following the same 
reasoning as the original proof in [11]. 



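Theorem 1: For any constant $\eta > 0$, any decoding 
technique which always provides a solution for the regularized 
$\eta$-BDD is DMT-optimal. 

Proof: Let $d_{\mathrm{ML}}(r)$ be the optimal diversity gain 
corresponding to a multiplexing gain $r \in \{0, \ldots, \min(n_T, n_R)\}$. 
Using the same notation as [11], we consider the constellation 
$\Lambda_r \cap \mathcal{R}$, where the lattice $\Lambda_r = \rho^{-rT/n}\mathbb{Z}^n$ is scaled according 
to the SNR, and $\mathcal{R}$ is a fixed shaping region. Let $\mathcal{B} \subset \mathcal{R}$ be a 
ball of fixed radius $R$, where $R$ is chosen in such a way that 
$\mathbf{d}_1 + \mathbf{d}_2 \in \mathcal{R}$, $\forall \mathbf{d}_1, \mathbf{d}_2 \in \mathcal{B}$. Let 

$$\nu_r = \min_{\mathbf{d} \in \mathcal{B} \cap \Lambda_r,\ \mathbf{d} \ne \mathbf{0}} \frac{1}{4}\|\mathbf{B}'\mathbf{d}\|^2.$$

Then Lemma 1 of [11] holds, that is 

$$\limsup_{\rho \to \infty} \frac{\log P\{\nu_r < 1\}}{\log \rho} \le -d_{\mathrm{ML}}(r).$$

Let $\zeta > 0$ and choose $\delta$ such that $\frac{2\zeta T}{n} > \delta > 0$. We have 
$\Lambda_r = \rho^{\zeta T/n}\Lambda_{r+\zeta}$. As in the original proof, there exists $\rho_1$ such 
that for any $\rho > \rho_1$, we have $\mathcal{R} \subset \frac{1}{2}\rho^{\zeta T/n}\mathcal{B}$. As in Theorem 1 
from [11], we want to show that the conditions 

$$\nu_{r+\zeta} \ge 1, \qquad \|\mathbf{n}'\|^2 \le \rho^{\delta} \tag{20}$$

are sufficient for the regularized $\eta$-BDD solver to decode 
correctly for sufficiently large SNR. First of all, we establish 
a lower bound for the minimum squared norm 

$$\frac{d_{\mathbf{R}}^2}{4} = \min_{\mathbf{x} \in \Lambda_r \setminus \{\mathbf{0}\}} \frac{1}{4}\|\mathbf{R}\mathbf{x}\|^2 = \min_{\mathbf{x} \in \Lambda_r \setminus \{\mathbf{0}\}} \frac{1}{4}\left(\|\mathbf{B}'\mathbf{x}\|^2 + \|\mathbf{x}\|^2\right)$$

as follows. Let $\varphi(\mathbf{x}) = \|\mathbf{B}'\mathbf{x}\|^2 + \|\mathbf{x}\|^2$. Let $\mathbf{x} \in \Lambda_r \setminus \{\mathbf{0}\}$ be 
any lattice point. 

If $\mathbf{x} \notin \frac{1}{2}\rho^{\zeta T/n}\mathcal{B}$, then $\varphi(\mathbf{x}) \ge \|\mathbf{x}\|^2 \ge \frac{1}{4}R^2\rho^{2\zeta T/n}$. 

If $\mathbf{x} \in \frac{1}{2}\rho^{\zeta T/n}\mathcal{B}$, then $\mathbf{x}\rho^{-\zeta T/n} \in \frac{1}{2}\mathcal{B} \cap \Lambda_{r+\zeta}$ 
and so $\frac{1}{4}\|\mathbf{B}'\mathbf{x}\rho^{-\zeta T/n}\|^2 \ge 1$ since 
by the hypothesis (20), we have $\nu_{r+\zeta} \ge 1$. Therefore we 
obtain $\varphi(\mathbf{x}) \ge \|\mathbf{B}'\mathbf{x}\|^2 \ge 4\rho^{2\zeta T/n}$. 

In conclusion, there exists $k > 0$ such that $d_{\mathbf{R}}^2 \ge k\rho^{2\zeta T/n}$. 

Now consider the transmitted codeword $\mathbf{x} \in \Lambda_r \cap \mathcal{R}$. 
The regularized $\eta$-BDD decoder is able to decode correctly 
provided that $\|\mathbf{y}_1 - \mathbf{R}\mathbf{x}\| < \eta d_{\mathbf{R}}$. We have 

$$\|\mathbf{y}_1 - \mathbf{R}\mathbf{x}\|^2 = \|\mathbf{y}' - \mathbf{B}'\mathbf{x}\|^2 + \|\mathbf{x}\|^2 = \|\mathbf{n}'\|^2 + \|\mathbf{x}\|^2 \le \|\mathbf{n}'\|^2 + c^2,$$

where $c = \max_{\mathbf{r} \in \mathcal{R}} \|\mathbf{r}\|$ is a constant. Therefore under 
the conditions (20), the regularized $\eta$-BDD decoder is able 
to decode correctly provided that $\rho^{\delta} + c^2 < \eta^2 k \rho^{2\zeta T/n}$. But 
$\delta < \frac{2\zeta T}{n}$, so there exists $\bar{\rho}$ such that for any $\rho > \bar{\rho}$, we have 
$\rho^{\delta} + c^2 < \eta^2 k \rho^{2\zeta T/n}$. Then as in Theorem 1 from [11] we can 
conclude that 

$$P\{\hat{\mathbf{x}}_{\eta\text{-BDD}} \ne \mathbf{x}\} \le P\{\nu_{r+\zeta} < 1\} + P\{\|\mathbf{n}'\|^2 > \rho^{\delta}\}.$$

The second term is negligible for $\rho \to \infty$. So we can say, 
similarly to the original proof, that 

$$\limsup_{\rho \to \infty} \frac{\log P\{\hat{\mathbf{x}}_{\eta\text{-BDD}} \ne \mathbf{x}\}}{\log \rho} \le -d_{\mathrm{ML}}(r + \zeta)$$

and then use the right continuity of $d_{\mathrm{ML}}(r)$. 

□ 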



IV. Decoding Radius of Embedding 

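In this section we will review Kannan's embedding 
technique [30], show that it provides a BDD-solver, and analyze 
its decoding radius. 

The principle of this technique is to embed the basis 
matrix $\mathbf{B}$ and the received vector $\mathbf{y}$ into a higher dimensional 
lattice. More precisely, we consider the following $(m + 1) \times 
(n + 1)$ basis matrix: 

$$\bar{\mathbf{B}} = \begin{bmatrix} \mathbf{B} & -\mathbf{y} \\ \mathbf{0}_{1 \times n} & t \end{bmatrix}, \tag{21}$$

where $t > 0$ is a parameter to be determined, which we refer 
to as the embedding parameter. The strategy is to reduce CVP 
to SVP in the following way. For a suitable choice of $t$ and 
for sufficiently small noise norm, the vectors $\pm\mathbf{v}$ with $\mathbf{v} = 
[(\mathbf{B}\mathbf{x} - \mathbf{y})^T\ t]^T$ are the shortest vectors in the lattice $\mathcal{L}(\bar{\mathbf{B}})$. 
Thus an SVP algorithm will find $\mathbf{v}$, and the message $\mathbf{x}$ can be 
recovered from the coordinates of this vector in the basis $\bar{\mathbf{B}}$: 

$$\text{if } \mathbf{v} = \bar{\mathbf{B}} \begin{bmatrix} \mathbf{x}' \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{B}\mathbf{x}' - \mathbf{y} \\ t \end{bmatrix}, \text{ then } \hat{\mathbf{x}} = \mathbf{x}'. \tag{22}$$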
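
The embedding principle of (21) and (22) can be illustrated end to end on a toy lattice. The sketch below assumes NumPy and uses a plain textbook LLL (floating-point Gram-Schmidt recomputed at every step, suitable only for very small dimensions); the function names `gram_schmidt`, `lll_reduce` and `embedding_decode`, the toy basis, and the choice $t = 0.5$ are illustrative assumptions, not the decoder evaluated in the paper.

```python
import numpy as np

def gram_schmidt(B):
    """Gram-Schmidt orthogonalization of the columns of B (no normalization)."""
    n = B.shape[1]
    Bs = np.zeros_like(B)
    mu = np.zeros((n, n))
    for i in range(n):
        Bs[:, i] = B[:, i]
        for j in range(i):
            mu[i, j] = (B[:, i] @ Bs[:, j]) / (Bs[:, j] @ Bs[:, j])
            Bs[:, i] -= mu[i, j] * Bs[:, j]
    return Bs, mu

def lll_reduce(B, delta=0.75):
    """Textbook LLL on the columns of B; returns a reduced basis of the same lattice."""
    B = np.array(B, dtype=float)
    n = B.shape[1]
    k = 1
    while k < n:
        for j in range(k - 1, -1, -1):           # size-reduce column k
            _, mu = gram_schmidt(B)
            q = np.rint(mu[k, j])
            if q != 0:
                B[:, k] -= q * B[:, j]
        Bs, mu = gram_schmidt(B)
        lhs = Bs[:, k] @ Bs[:, k]
        rhs = (delta - mu[k, k - 1] ** 2) * (Bs[:, k - 1] @ Bs[:, k - 1])
        if lhs >= rhs:                           # Lovasz condition holds
            k += 1
        else:                                    # swap columns k-1 and k
            B[:, [k - 1, k]] = B[:, [k, k - 1]]
            k = max(k - 1, 1)
    return B

def embedding_decode(B, y, t, delta=0.75):
    """Build the extended basis (21), LLL-reduce it, and read x off the first vector."""
    m, n = B.shape
    B_ext = np.block([[B, -y.reshape(-1, 1)],
                      [np.zeros((1, n)), np.array([[t]])]])
    v = lll_reduce(B_ext, delta)[:, 0]
    if v[-1] < 0:
        v = -v                                   # fix the sign so the last entry is +t
    if not np.isclose(v[-1], t):
        return None                              # first vector is not of the form (22)
    coeffs = np.linalg.lstsq(B, v[:m] + y, rcond=None)[0]
    return np.rint(coeffs).astype(int)

B = np.array([[2.0, 1.0],
              [0.0, 3.0]])                       # toy basis with lambda_1(B) = 2
x = np.array([1, -2])
y = B @ x + np.array([0.2, -0.1])                # small noise
x_hat = embedding_decode(B, y, t=0.5)
print(x_hat)
```

Here $t = 0.5$ corresponds to $\lambda_1(\mathbf{B})/(2\gamma)$ with $\gamma = 2$, which is the approximation factor guaranteed by LLL with $\delta = 3/4$ in dimension 3. Returning `None` when the first reduced vector does not have last coordinate $\pm t$ is a design choice of this sketch to signal that the noise was too large for the chosen parameter.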



The LLL algorithm was used in [18] to find the shortest 
vector in the lattice $\mathcal{L}(\bar{\mathbf{B}})$, and the correct decoding radius 
was shown to be lower bounded by $\lambda_1(\mathbf{B})/(2\gamma)$ with $\gamma = O(2^n)$ 
when the parameter $t$ is set to $\frac{1}{2\sqrt{2}\,\alpha^{n/2}} \min_{1 \le i \le n} |r_{i,i}|$. 

In the following sections, we will derive improved bounds 
on the decoding radius. 

A. Reducing BDD to uSVP 

In [33], it is proven that by choosing $t = \mathrm{dist}(\mathbf{y}, \mathbf{B})$, 
the embedding technique allows one to reduce $1/(2\gamma)$-BDD 
to $\gamma$-uSVP. We show that one can achieve the same correct 
decoding radius by setting $t = \frac{1}{2\gamma}\lambda_1(\mathbf{B})$, thus bypassing the 
assumption from [33] that $\mathrm{dist}(\mathbf{y}, \mathbf{B})$ is known. In Section V 
we will show how to use an estimate of $\lambda_1(\mathbf{B})$ to achieve 
almost the same decoding radius. 

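Theorem 2 (Decoding Radius of Embedding): Applying 
$\gamma$-uSVP ($\gamma \ge 1$) to the extended lattice (21) with parameter $t$ 
($0 < t < \lambda_1(\mathbf{B})/\gamma$) guarantees a correct decoding radius 

$$R_{\mathrm{uSVP\text{-}Emb}} \ge \sqrt{\frac{t}{\gamma}\lambda_1(\mathbf{B}) - t^2}. \tag{24}$$

Setting $t = \frac{1}{2\gamma}\lambda_1(\mathbf{B})$ maximizes this lower bound. This gives: 

$$R_{\mathrm{uSVP\text{-}Emb}} \ge \frac{1}{2\gamma}\lambda_1(\mathbf{B}). \tag{25}$$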
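
The choice of $t$ can be checked with a one-line optimization. The following worked derivation is a sketch that treats the quantity under the square root in the lower bound of Theorem 2 as a function $f(t)$ (a name introduced here for illustration):

```latex
\begin{align*}
f(t) &= \frac{t}{\gamma}\,\lambda_1(\mathbf{B}) - t^2,
        \qquad 0 < t < \frac{\lambda_1(\mathbf{B})}{\gamma}, \\
f'(t) &= \frac{\lambda_1(\mathbf{B})}{\gamma} - 2t = 0
        \iff t = \frac{\lambda_1(\mathbf{B})}{2\gamma}, \\
f\!\left(\frac{\lambda_1(\mathbf{B})}{2\gamma}\right)
      &= \frac{\lambda_1(\mathbf{B})^2}{2\gamma^2} - \frac{\lambda_1(\mathbf{B})^2}{4\gamma^2}
       = \frac{\lambda_1(\mathbf{B})^2}{4\gamma^2}.
\end{align*}
```

Since $f$ is a concave parabola in $t$, this stationary point is the maximum, and taking the square root gives the radius $\lambda_1(\mathbf{B})/(2\gamma)$.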



Remark 3: If the SVP itself is solved, then the correct 
decoding radius satisfies $R_{\mathrm{uSVP\text{-}Emb}} \ge \frac{1}{2}\lambda_1(\mathbf{B})$. This result 
implies that embedding is more powerful than lattice 
reduction-aided SIC-decoding, since the latter still exhibits a widening 
gap to $\frac{1}{2}\lambda_1(\mathbf{B})$, which is at least polynomial in $n$ for 
Korkin-Zolotarev reduction and for dual Korkin-Zolotarev reduction 
(which require solving SVP instances) [12]. 

Theorem 2 is a direct consequence of the following lemma. 

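Lemma 2: Let $\bar{\mathbf{B}}$ be the matrix defined in (21), and let $0 < 
t < \lambda_1(\mathbf{B})/\gamma$, with $\gamma \ge 1$. Suppose that 

$$\|\mathbf{y} - \mathbf{B}\mathbf{x}\| < \sqrt{\frac{t}{\gamma}\lambda_1(\mathbf{B}) - t^2}.$$

Then $\mathbf{v} = [(\mathbf{B}\mathbf{x} - \mathbf{y})^T\ t]^T$ is a $\gamma$-unique shortest vector of $\mathcal{L}(\bar{\mathbf{B}})$. 

Proof: Let $\mathbf{w}$ be an arbitrary nonzero vector in $\mathcal{L}(\mathbf{B})$. 
Any vector in $\mathcal{L}(\bar{\mathbf{B}})$ that is not a multiple of $\mathbf{v}$ is of the form 

$$\mathbf{w}' = \begin{bmatrix} \mathbf{w} \\ 0 \end{bmatrix} + q\mathbf{v},$$

with $q \in \mathbb{Z}$ and $\mathbf{w} \in \mathcal{L}(\mathbf{B}) \setminus \mathbf{0}$. We will show that $\|\mathbf{w}'\| > 
\gamma\|\mathbf{v}\|$. The norm of $\mathbf{w}'$ can be written as 

$$\|\mathbf{w}'\| = \sqrt{\|\mathbf{w} - q\mathbf{n}\|^2 + q^2t^2},$$

where $\mathbf{n} = \mathbf{y} - \mathbf{B}\mathbf{x}$. If $\|q\mathbf{n}\| < \lambda_1(\mathbf{B})$, using the triangle 
inequality, we have $\|\mathbf{w} - q\mathbf{n}\| \ge \|\mathbf{w}\| - |q|\|\mathbf{n}\| \ge \lambda_1(\mathbf{B}) - |q|\|\mathbf{n}\|$. 
Thus we have the lower bound 

$$\|\mathbf{w}'\| \ge \sqrt{(\lambda_1(\mathbf{B}) - |q|\|\mathbf{n}\|)^2 + (qt)^2} = \sqrt{\lambda_1(\mathbf{B})^2 - 2|q|\lambda_1(\mathbf{B})\|\mathbf{n}\| + q^2\|\mathbf{n}\|^2 + q^2t^2} \ge \frac{\lambda_1(\mathbf{B})\,t}{\sqrt{\|\mathbf{n}\|^2 + t^2}}, \tag{26}$$

where the last inequality follows by minimizing the quadratic 
expression over $|q|$. 
If $\|q\mathbf{n}\| \ge \lambda_1(\mathbf{B})$, we can also obtain the same bound because 

$$\|\mathbf{w}'\| \ge |q|\,t \ge \frac{\lambda_1(\mathbf{B})\,t}{\|\mathbf{n}\|} \ge \frac{\lambda_1(\mathbf{B})\,t}{\sqrt{\|\mathbf{n}\|^2 + t^2}}.$$

To prove that $\|\mathbf{w}'\| > \gamma\|\mathbf{v}\|$, it suffices to ensure that 

$$\frac{\lambda_1(\mathbf{B})\,t}{\sqrt{\|\mathbf{n}\|^2 + t^2}} > \gamma\sqrt{\|\mathbf{n}\|^2 + t^2}.$$

This is implied by the assumption that 

$$\|\mathbf{n}\|^2 = \|\mathbf{B}\mathbf{x} - \mathbf{y}\|^2 < \frac{t}{\gamma}\lambda_1(\mathbf{B}) - t^2.$$

□ 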
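
Lemma 2 can be sanity-checked by brute force on a small instance. The sketch below assumes NumPy, a hypothetical toy basis with $\lambda_1(\mathbf{B}) = 2$, $\gamma = 2$ and $t = \lambda_1(\mathbf{B})/(2\gamma)$; it enumerates integer coefficient vectors in a bounded box (the bound `K` is an arbitrary illustrative choice), so it only illustrates the statement and does not replace the proof.

```python
import itertools
import numpy as np

B = np.array([[2.0, 1.0],
              [0.0, 3.0]])                 # toy basis, lambda_1(B) = 2
gamma, lam1 = 2.0, 2.0
t = lam1 / (2 * gamma)                     # embedding parameter, here 0.5
x = np.array([1, -2])
noise = np.array([0.2, -0.1])              # ||n||^2 = 0.05 < t*lam1/gamma - t^2 = 0.25
y = B @ x + noise

# Extended basis (B | -y ; 0 | t) and the candidate shortest vector v.
B_ext = np.block([[B, -y.reshape(-1, 1)],
                  [np.zeros((1, 2)), np.array([[t]])]])
v = B_ext @ np.array([x[0], x[1], 1])

K = 8                                      # half-width of the search box (assumption)
others = [
    np.linalg.norm(B_ext @ np.array(c))
    for c in itertools.product(range(-K, K + 1), repeat=3)
    # keep nonzero coefficient vectors that are not integer multiples of (x, 1)
    if any(c) and not (c[0] == c[2] * x[0] and c[1] == c[2] * x[1])
]
min_other = min(others)

print(np.linalg.norm(v), min_other, min_other > gamma * np.linalg.norm(v))
```

Every enumerated lattice vector that is not a multiple of $\mathbf{v}$ is more than $\gamma$ times longer than $\mathbf{v}$, which is the $\gamma$-uniqueness asserted by the lemma.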



As the LLL algorithm can solve $\gamma$-uSVP with $\gamma = \alpha^{n/2}$ for 
the basis (21) of dimension $n + 1$ (see Equation (12)), one can 
obtain that if using LLL, the correct decoding radius satisfies 

$$R_{\mathrm{uSVP\text{-}Emb}} \ge \frac{1}{2\alpha^{n/2}}\,\lambda_1(\mathbf{B}) \tag{27}$$

by choosing $t = \frac{1}{2\alpha^{n/2}}\lambda_1(\mathbf{B})$. This decoding radius improves 
the bound from [18]. 

However, it can still be improved. 
The reason is that the estimate $\gamma = \alpha^{n/2}$ is pessimistic for 
$\gamma$-uSVP. In fact, the quantity $\alpha^{n/2}$ is just the approximation factor 
for the Approximate SVP achieved by LLL. Any algorithm 
solving $\gamma$-SVP necessarily solves $\gamma$-uSVP, while the converse 
is not true. 

We now give two complementary approaches for improving 
the lower bound on $R_{\mathrm{uSVP\text{-}Emb}}$ obtained by Kannan's 
embedding based on LLL. In the first approach, described in 
Subsection IV-B, we provide a new reduction from uSVP 
to HSVP. This implies improved lower bounds on the correct 
decoding radius for Kannan's embedding based on any HSVP 
solver. 

For the second approach, we start by giving a new bound 
on the performance of LLL-SIC in Subsection II-F, which 
is of independent interest. We then follow the execution of 
LLL within Kannan's embedding to show that it performs at 
least as well as LLL-SIC (Subsection IV-C), which leads to an 
improved lower bound on the correct decoding radius achieved 
by Kannan's embedding with LLL. This bound is slightly 
better (by a small constant factor) than the bound obtained by 
instantiating the first approach with LLL. The first approach 
is more general, whereas the second approach gives further 
insight on the relationship between Kannan's embedding and 
LLL-SIC. 

B. Reducing uSVP to HSVP 

We describe a reduction from solving $\gamma$-uSVP to solving $C$-HSVP for $\gamma \approx \sqrt{n}\,C$. We will illustrate the usefulness of this approach by considering several reduction algorithms solving $C$-HSVP with diverse time/quality trade-offs.

Theorem 3 (Reduction from uSVP to HSVP): Suppose that the sequence $\{C_k\}$ is such that $(C_k)^{k/(k-1)}$ increases with $k$. Then for any $\gamma \ge \sqrt{\gamma_{n-1}}\,(C_n)^{n/(n-1)}$, $\gamma$-uSVP reduces to $C_n$-HSVP.

Proof: Assume we have access to a $C_n$-HSVP oracle. Let $\mathcal{L}$ be an $n$-dimensional lattice such that $\lambda_2(\mathcal{L}) > \sqrt{\gamma_{n-1}}\,(C_n)^{n/(n-1)}\lambda_1(\mathcal{L})$. Assume we are given a basis $\mathbf{B}$ of $\mathcal{L}$. We use the HSVP oracle in the following way:

• Use the oracle on the dual lattice $\mathcal{L}^* = \mathcal{L}(\mathbf{B}^*)$ to find a short vector $\mathbf{c}_1^* \in \mathcal{L}^*$; compute the largest integer $k$ such that $\mathbf{c}_1^*$ belongs to $k\mathcal{L}^*$ and divide $\mathbf{c}_1^*$ by $k$; extend $\mathbf{c}_1^*$ into a complete basis $\mathbf{C}^*$ of $\mathcal{L}^*$. This can be done in polynomial time by considering a unimodular $n \times n$ matrix $\mathbf{V}$ that brings the coordinate vector of $\mathbf{c}_1^*$ into Hermite Normal Form; the remaining $n - 1$ basis vectors are then obtained from $\mathbf{V}$ (see for example [44], Section 4).

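• For $i = 2, \ldots, n$: Project the vectors $\mathbf{c}_i^*, \ldots, \mathbf{c}_n^*$ onto the orthogonal complement of the space generated by $\mathbf{c}_1^*, \ldots, \mathbf{c}_{i-1}^*$. Let $\bar{\mathbf{c}}_i^*, \ldots, \bar{\mathbf{c}}_n^*$ be the projected vectors. Note that the determinant of the projected lattice $\mathcal{L}([\bar{\mathbf{c}}_i^*, \ldots, \bar{\mathbf{c}}_n^*])$ is equal to

$$\frac{\det(\mathcal{L}^*)}{\det([\mathbf{c}_1^*, \ldots, \mathbf{c}_{i-1}^*])} = \frac{\det(\mathcal{L}^*)}{\prod_{j=1}^{i-1} r^*_{j,j}} = \prod_{j=i}^{n} r^*_{j,j},$$

where $\mathbf{C}^* = \mathbf{Q}^*\mathbf{R}^*$ is the QR decomposition of $\mathbf{C}^*$. Apply the HSVP oracle again to find a short vector

$$\bar{\mathbf{v}}_i^* = \sum_{k=i}^{n} x_k \bar{\mathbf{c}}_k^*$$

in the projected lattice $\mathcal{L}([\bar{\mathbf{c}}_i^*, \ldots, \bar{\mathbf{c}}_n^*])$; lift it to a vector $\mathbf{v}_i^*$ in $\mathcal{L}([\mathbf{c}_i^*, \ldots, \mathbf{c}_n^*])$ given by

$$\mathbf{v}_i^* = \sum_{k=i}^{n} x_k \mathbf{c}_k^*.$$

Then replace $\mathbf{c}_i^*$ by $\mathbf{v}_i^*$ and complete the dual basis. Since lifting affects neither the orthogonal projections nor the $r^*_{i,i} = \langle \mathbf{q}_i^*, \mathbf{v}_i^* \rangle$, the new basis satisfies

$$r^*_{i,i} \le C_{n-i+1}\Big(\prod_{j=i}^{n} r^*_{j,j}\Big)^{1/(n-i+1)}$$

and consequently

$$r^*_{i,i} \le (C_{n-i+1})^{\frac{n-i+1}{n-i}}\Big(\prod_{j=i+1}^{n} r^*_{j,j}\Big)^{1/(n-i)}. \tag{28}$$

This property still holds at subsequent steps of the algorithm since the operation of extracting a short vector and lifting decreases $r^*_{i,i}$, and so increases $\prod_{j=i+1}^{n} r^*_{j,j}$.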
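We claim that $\mathbf{c}_1$ in the primal basis $\mathbf{C} = (\mathbf{C}^*)^*$ is the shortest vector $\mathbf{v}$ of $\mathcal{L}(\mathbf{B})$. We prove this fact by contradiction. Suppose that $\mathbf{c}_1 \neq \pm\mathbf{v}$, where $\pm\mathbf{v}$ are the unique shortest vectors of $\mathcal{L}$. Note that $\mathbf{c}_1$ cannot be $\pm 2\mathbf{v}$ or other multiples, since in that case $\mathbf{C}$ would not be a basis of $\mathcal{L}$. We may write

$$\mathbf{v} = \sum_{i=1}^{k} x_i \mathbf{c}_i,$$

where the $x_i$ are integers and $k$ is the largest $i$ such that $x_i$ is nonzero.

Observe that if $\mathbf{C} = \mathbf{QR}$ is the QR decomposition of $\mathbf{C}$,

$$\mathbf{v} = x_k r_{k,k}\mathbf{q}_k + \sum_{i=1}^{k-1}\Big(x_i r_{i,i} + \sum_{j=i+1}^{k} x_j r_{i,j}\Big)\mathbf{q}_i.$$

Since the $\mathbf{q}_i$'s are orthogonal and $|x_k| \ge 1$, we have:

$$\lambda_1 = \|\mathbf{v}\| \ge \|x_k r_{k,k}\mathbf{q}_k\| \ge r_{k,k}.$$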

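Using the assumption that $\mathbf{c}_1 \neq \pm\mathbf{v}$, we have that $k > 1$. This ensures that:

$$\lambda_2 \le \lambda_1(\mathcal{L}[\mathbf{c}_1, \ldots, \mathbf{c}_{k-1}]).$$

Indeed, the second minimum $\lambda_2$ must be no greater than the norm of the shortest nonzero vector in the sublattice spanned by $\{\mathbf{c}_1, \ldots, \mathbf{c}_{k-1}\}$, since these vectors are linearly independent with $\mathbf{v}$. The fact that $k > 1$ ensures that there are non-zero vectors in that lattice. Using Minkowski's first theorem, we obtain

$$\lambda_2 \le \sqrt{\gamma_{k-1}}\,\det(\mathcal{L}[\mathbf{c}_1, \ldots, \mathbf{c}_{k-1}])^{\frac{1}{k-1}} = \sqrt{\gamma_{k-1}}\Big(\prod_{i=1}^{k-1} r_{i,i}\Big)^{\frac{1}{k-1}} = \sqrt{\gamma_{k-1}}\Big(\prod_{i=n-k+2}^{n} r^*_{i,i}\Big)^{-\frac{1}{k-1}}, \tag{29}$$

where we used the relation $r_{i,i} = 1/r^*_{n-i+1,n-i+1}$.

In the meantime, the HSVP oracle (28), applied with $i = n-k+1$, ensures that

$$r^*_{n-k+1,n-k+1} \le (C_k)^{\frac{k}{k-1}}\Big(\prod_{i=n-k+2}^{n} r^*_{i,i}\Big)^{\frac{1}{k-1}}, \tag{30}$$

that is,

$$\Big(\prod_{i=n-k+2}^{n} r^*_{i,i}\Big)^{-\frac{1}{k-1}} \le \frac{(C_k)^{\frac{k}{k-1}}}{r^*_{n-k+1,n-k+1}} = (C_k)^{\frac{k}{k-1}}\, r_{k,k} \tag{31}$$

for any $k > 1$. Substituting into (29), we have

$$\lambda_2 \le \sqrt{\gamma_{k-1}}\,(C_k)^{\frac{k}{k-1}}\, r_{k,k} \le \sqrt{\gamma_{n-1}}\,(C_n)^{\frac{n}{n-1}}\lambda_1,$$

where we used the (mild) assumption that $(C_k)^{k/(k-1)}$ increases with $k$. The last statement is a contradiction because we assumed $\lambda_2 > \sqrt{\gamma_{n-1}}\,(C_n)^{n/(n-1)}\lambda_1$. This completes the proof. □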

We now instantiate Theorems 2 and 3 with two different HSVP solvers.

The LLL algorithm solves $C_n$-HSVP with $C_n = \alpha^{(n-1)/4}$ (see Equation (11)). The sequence $(C_n)^{n/(n-1)}$ grows with $n$, and thus, by Theorem 3, LLL solves any $n$-dimensional instances of $\gamma$-uSVP with $\gamma = \sqrt{\gamma_{n-1}}\,\alpha^{n/4}$. Note that in the reduction from uSVP to HSVP (in the proof of Theorem 3), a single LLL reduction suffices, even if the reduction calls the HSVP oracle many times on projections of the dual lattice. This is because LLL is almost self-dual and the projected sublattices are also reduced. More precisely: We call a basis effectively LLL-reduced if it satisfies condition (8) for $j = i - 1$ (and possibly not for $j < i - 1$) and if it satisfies the Lovász condition (9). A basis that is effectively LLL-reduced also satisfies Equations (10), (11) and (12). The LLL algorithm is self-dual in the sense that if a basis is effectively LLL-reduced, so is its dual basis (see [45]). Moreover, if $[\mathbf{b}_1, \ldots, \mathbf{b}_n]$ is LLL-reduced, the projection of the basis $[\mathbf{b}_i, \ldots, \mathbf{b}_n]$ on the orthogonal complement of the vector space generated by $\mathbf{b}_1, \ldots, \mathbf{b}_{i-1}$ is also LLL-reduced.

Overall, if one relies on LLL as HSVP solver, we obtain that Kannan's embedding achieves correct decoding radius $\ge \frac{1}{2\sqrt{\gamma_n}\,\alpha^{(n+1)/4}}\lambda_1$, when using embedding parameter

$$t = \frac{1}{2\sqrt{\gamma_n}\,\alpha^{(n+1)/4}}\lambda_1.$$

The BKZ algorithm m solves C-HSVP with a smaller C, 
but at the cost of a higher run-time. It is parametrized by 
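a block-size $\beta \in [2, n]$. In [46], a variant of BKZ is given, which achieves $C_{n,\beta} = 2(\gamma_\beta)^{\frac{n-1}{2(\beta-1)} + \frac{3}{2}}$ in time polynomial in $n$ and $2^\beta$. For a fixed value of $\beta$ (and even for a block-size $\beta$ that is growing slowly with respect to $n$), the sequence $(C_{n,\beta})^{n/(n-1)}$ grows with $n$, and thus, by Theorem 3, the modified BKZ with block-size $\beta$ solves any $n$-dimensional instances of $\gamma$-uSVP with

$$\gamma = \sqrt{\gamma_{n-1}}\,(C_{n,\beta})^{\frac{n}{n-1}} = 2^{\frac{n}{n-1}}\sqrt{\gamma_{n-1}}\,(\gamma_\beta)^{\frac{n}{2(\beta-1)} + \frac{3n}{2(n-1)}}.$$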



C. Embedding based on LLL is at least as good as LLL-SIC 

The analysis in Subsection IV-B holds generally for any HSVP solver. In this section we focus on the LLL algorithm, and prove a stronger bound, namely that embedding based on LLL has a decoding radius at least as large as LLL-SIC's. The key observation is as follows: If $\mathbf{y}$ falls within the decoding radius of Babai's algorithm, the vector $[(\mathbf{Bx} - \mathbf{y})^T \;\, t]^T$ will be the shortest vector; it will be moved by LLL to the first column of the basis, and will stay there during the rest of the execution of LLL.
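The key observation can be reproduced on a toy example. The following is a minimal sketch under stated assumptions: it uses a plain textbook floating-point LLL (Gram-Schmidt recomputed at every step, $\delta = 0.75$), stores basis vectors as rows rather than columns, and the names `lll` and `embedding_decode` as well as the numerical instance are hypothetical. It is not the implementation used in the experiments of this paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(rows):
    """Return the orthogonalised rows and the Gram-Schmidt coefficients mu."""
    n = len(rows)
    ortho, mu = [], [[0.0] * n for _ in range(n)]
    for i in range(n):
        v = list(rows[i])
        for j in range(i):
            mu[i][j] = dot(rows[i], ortho[j]) / dot(ortho[j], ortho[j])
            v = [a - mu[i][j] * b for a, b in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll(rows, delta=0.75):
    """Textbook LLL on basis rows; returns (reduced rows, unimodular transform)."""
    rows = [list(map(float, r)) for r in rows]
    n = len(rows)
    unimod = [[int(i == j) for j in range(n)] for i in range(n)]
    k = 1
    while k < n:
        # Full round of size reductions for row k before the Lovasz test.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(rows)
            q = round(mu[k][j])
            if q != 0:
                rows[k] = [a - q * b for a, b in zip(rows[k], rows[j])]
                unimod[k] = [a - q * b for a, b in zip(unimod[k], unimod[j])]
        ortho, mu = gram_schmidt(rows)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:  # Lovasz test fails: swap and step back
            rows[k - 1], rows[k] = rows[k], rows[k - 1]
            unimod[k - 1], unimod[k] = unimod[k], unimod[k - 1]
            k = max(k - 1, 1)
    return rows, unimod


def embedding_decode(basis, y, t, delta=0.75):
    """Embed [-y; t] as an extra basis vector, reduce, and read x off the first vector."""
    extended = [list(b) + [0.0] for b in basis] + [[-c for c in y] + [t]]
    _, unimod = lll(extended, delta)
    first = unimod[0]          # integer coefficients of the first reduced vector
    sign = first[-1]           # coefficient of the embedded vector, expected +1 or -1
    if abs(sign) != 1:
        return None            # the first vector is not of the form +/-[Bx - y; t]
    return [sign * c for c in first[:-1]]


# Toy instance (hypothetical numbers): x = (3, -2), noise (0.3, -0.2).
basis = [(7.0, 2.0), (3.0, 8.0)]
y = (15.3, -10.2)
print(embedding_decode(basis, y, t=0.5))
```

On this instance the size reduction of the embedded vector already produces $[(\mathbf{Bx}-\mathbf{y})^T \;\, t]^T$, and the subsequent swaps carry it to the first position, as described in the paragraph above.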

Lemma 3 (Embedding is at least as good as LLL-SIC): Consider a fixed realization $\mathbf{y} = \mathbf{Bx} + \mathbf{n}$ of the MIMO system (4). Suppose that $\|\mathbf{n}\| < R_{\text{LLL-SIC}}$, so that the LLL-SIC decoder returns the correct transmitted vector $\mathbf{x}$. Then the embedding technique (based on LLL) with the choice $t = R_{\text{LLL-SIC}}$ also outputs the correct transmitted vector for the same MIMO system. Consequently, the correct decoding radius of the embedding technique is greater than or equal to $R_{\text{LLL-SIC}}$.

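Proof: Let $\mathbf{B}^{\text{red}} = \mathbf{BU}$ be the LLL-reduced channel matrix.

If $\mathbf{B}^{\text{red}} = \mathbf{QR}$ is the QR decomposition of $\mathbf{B}^{\text{red}}$, then the output of LLL-SIC is given by $\mathbf{U}\hat{\mathbf{x}}$, where $\hat{\mathbf{x}}$ is defined recursively by

$$\hat{x}_i = \left\lfloor \frac{\langle \mathbf{q}_i, \mathbf{y} \rangle - \sum_{j=i+1}^{n} r_{i,j}\hat{x}_j}{r_{i,i}} \right\rceil, \quad i = n, \ldots, 1.$$

Suppose that the noise vector $\mathbf{n}$ is shorter than the correct decoding radius of LLL-SIC, that is

$$\|\mathbf{n}\| < R_{\text{LLL-SIC}} = \frac{1}{2}\min_{1 \le i \le n} |r_{i,i}| = t. \tag{32}$$

Observe that the hypothesis of Lemma 2 is satisfied for $\gamma = 1$, that is

$$\sqrt{t\lambda_1(\mathbf{B}) - t^2} \ge \sqrt{2tR_{\text{LLL-SIC}} - t^2} = R_{\text{LLL-SIC}} = t > \|\mathbf{n}\| \tag{33}$$

since $\lambda_1(\mathbf{B}) \ge 2R_{\text{LLL-SIC}}$. Consequently, Lemma 2 implies that $[(\mathbf{Bx} - \mathbf{y})^T \;\, t]^T$ is a unique shortest vector in the extended lattice.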
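For reference, the SIC recursion and the radius in (32) can be written in a few lines. This is a minimal sketch, assuming the basis passed in is already LLL-reduced (no reduction is performed here) and using unnormalised Gram-Schmidt vectors $\mathbf{b}_i^*$, for which $|r_{i,i}| = \|\mathbf{b}_i^*\|$; the function names and the numbers are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt_vectors(basis):
    """Unnormalised Gram-Schmidt vectors b_i^*; note |r_ii| = ||b_i^*||."""
    ortho = []
    for b in basis:
        v = list(b)
        for o in ortho:
            c = dot(b, o) / dot(o, o)
            v = [a - c * x for a, x in zip(v, o)]
        ortho.append(v)
    return ortho


def sic_radius(basis):
    """R_SIC = (1/2) * min_i |r_ii| for the given (assumed reduced) basis."""
    return 0.5 * min(math.sqrt(dot(o, o)) for o in gram_schmidt_vectors(basis))


def sic_decode(basis, y):
    """Successive interference cancellation (Babai's nearest plane), from i = n down to 1."""
    ortho = gram_schmidt_vectors(basis)
    residual = list(y)
    x_hat = [0] * len(basis)
    for i in reversed(range(len(basis))):
        x_hat[i] = round(dot(residual, ortho[i]) / dot(ortho[i], ortho[i]))
        residual = [r - x_hat[i] * b for r, b in zip(residual, basis[i])]
    return x_hat


# Hypothetical reduced basis and received vector: y = 1*(7, 2) - 2*(-4, 6) + (0.3, -0.2).
reduced_basis = [(7.0, 2.0), (-4.0, 6.0)]
y = (15.3, -10.2)
print(sic_radius(reduced_basis), sic_decode(reduced_basis, y))
```

Here the noise norm (about 0.36) is well below the radius (about 3.43), so the recursion returns the transmitted coefficients in the reduced basis.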

Consider an alternate version of LLL reduction in which a full round of size reductions RED($k, i$), $i = k - 1, \ldots, 1$ is performed before the Lovász test, i.e., when considering vector $\mathbf{b}_k$, the LLL variant ensures that condition (8) is satisfied for all $r_{i,k}$ (with varying $i$) before checking condition (9). Since size reduction has no impact on the Lovász test, this version leads to the same output as the usual LLL algorithm [14]. After LLL-reducing the first $n$ columns, the augmented channel matrix is of the form

$$\tilde{\mathbf{B}}^{\text{red}} = \begin{bmatrix} \mathbf{B}^{\text{red}} & -\mathbf{y} \\ \mathbf{0}_{1 \times n} & t \end{bmatrix}.$$

By doing a first round of size reduction RED($n + 1, i$), $i = n, \ldots, 1$ on the last column, we find that the $(n + 1)$-th column is $\mathbf{b}_{n+1} = [-\mathbf{n}^T \;\, t]^T$, as size-reduction is exactly SIC. At this stage, we have that the augmented matrix is of the form

$$\begin{bmatrix} \mathbf{b}_1^{\text{red}} & \cdots & \mathbf{b}_n^{\text{red}} & -\mathbf{n} \\ 0 & \cdots & 0 & t \end{bmatrix}.$$

We will prove by induction that for all the subsequent steps indexed by $k = n, \ldots, 1$, the Lovász condition on the columns $k + 1$ and $k$ fails, and there is a swap. At the beginning of step $k$ the augmented matrix is of the form

$$\tilde{\mathbf{B}}^{(k)} = \begin{bmatrix} \mathbf{b}_1^{\text{red}} & \cdots & \mathbf{b}_k^{\text{red}} & -\mathbf{n} & * & \cdots & * \\ 0 & \cdots & 0 & t & * & \cdots & * \end{bmatrix}.$$

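The inductive step works as follows. Let $\tilde{\mathbf{B}}^{(k)} = \mathbf{Q}^{(k)}\mathbf{R}^{(k)}$ be the QR decomposition of $\tilde{\mathbf{B}}^{(k)}$. Then

$$\big(r^{(k)}_{k+1,k+1}\big)^2 + \big(r^{(k)}_{k,k+1}\big)^2 \le \big\|\mathbf{b}^{(k)}_{k+1}\big\|^2 = \|\mathbf{n}\|^2 + t^2,$$

since the columns of $\mathbf{R}^{(k)}$ are projections of the corresponding columns of $\tilde{\mathbf{B}}^{(k)}$. All the swaps will take place since, because of condition (32),

$$\|\mathbf{n}\|^2 + t^2 < 2t^2 = \frac{1}{2}\min_{1 \le i \le n} r_{i,i}^2 \le \frac{1}{2}r_{k,k}^2,$$

which is at most $\delta r_{k,k}^2$ for $\delta \ge 1/2$.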

After the last swap, we obtain

$$\tilde{\mathbf{B}}^{(0)} = \begin{bmatrix} -\mathbf{n} & * & \cdots & * \\ t & * & \cdots & * \end{bmatrix}. \tag{34}$$

Now, recall that if the first column $\mathbf{b}_1$ of a basis matrix is a shortest lattice vector, then it remains at the first position during the whole execution of LLL. Indeed, it is never swapped. To see it, recall that the swap between the first and the second columns takes place only if $\|\mathbf{b}_2\|^2 < \delta\|\mathbf{b}_1\|^2$. This cannot occur as $\mathbf{b}_2 \neq \mathbf{0}$ and $\mathbf{b}_1$ is a shortest non-zero lattice vector.

Thanks to condition (33), the vector $[-\mathbf{n}^T \;\, t]^T$ is a shortest non-zero vector of the augmented lattice. So it is not swapped during the subsequent steps of the execution of LLL, and thus it is the first column of the output basis $\tilde{\mathbf{B}}^{\text{red}}$.

To conclude, we have proven that with the choice $t = R_{\text{LLL-SIC}}$, the correct decoding radius of embedding is at least as large as $R_{\text{LLL-SIC}}$. □

Remark 4: The proposed value $t = R_{\text{LLL-SIC}}$ of the embedding parameter can be efficiently computed after $\mathbf{B}^{\text{red}}$ is found and before reducing the $(n + 1)$-th column of $\tilde{\mathbf{B}}$.

V. Dealing With $\lambda_1$

The derived bounds on the correct decoding radius hold only if the minimum distance $\lambda_1$ is known. However, $\lambda_1$ can only be obtained by solving SVP, which is generally a difficult problem. Fortunately, there are alternative approaches that do not require the knowledge of the exact value of $\lambda_1$.

A. Rigorous approach

Suppose we do not know $\lambda_1$, but that we have a good estimate of it: $\lambda_1 \in [A, \kappa A]$ for some factor $\kappa > 1$. Let

$$t = \frac{A}{2\gamma} \in \left[\frac{\lambda_1}{2\gamma\kappa}, \frac{\lambda_1}{2\gamma}\right].$$

The assumption of Theorem 2 is satisfied. Observe that the right-hand side of (24) is a non-decreasing function of $t$ in this interval. Then the correct decoding radius is

$$R_{\text{Emb}} \ge \sqrt{\frac{t\lambda_1}{\gamma} - t^2} \ge \frac{\lambda_1}{\gamma}\sqrt{\frac{1}{2\kappa} - \frac{1}{4\kappa^2}} \ge \frac{\lambda_1}{\gamma}\sqrt{\frac{1}{2\kappa} - \frac{1}{4\kappa}} = \frac{\lambda_1}{2\sqrt{\kappa}\gamma}. \tag{35}$$

Equation (35) shows that for any approximation constant $\kappa$, we lose at most a constant factor $\sqrt{\kappa}$ in the correct decoding radius.

We recall the following useful property of LLL-reduced $\mathbf{B}$ which follows from (13):

$$\alpha^{-(n-1)/2}\|\mathbf{b}_1\| \le \lambda_1 \le \|\mathbf{b}_1\|. \tag{36}$$

Letting $A = \alpha^{-(n-1)/2}\|\mathbf{b}_1\|$, we obtain

$$A \le \lambda_1 \le \alpha^{(n-1)/2}A. \tag{37}$$

Substituting $\kappa = \alpha^{(n-1)/2}$ into (35) and choosing $\gamma = \sqrt{\gamma_n}\,\alpha^{(n+1)/4}$ as in Subsection IV-B for $(n + 1)$-dimensional lattices, we can obtain a decoding radius

$$R \ge \frac{\lambda_1}{2\alpha^{(n-1)/4}\gamma} = \frac{\lambda_1}{2\sqrt{\gamma_n}\,\alpha^{n/2}}$$

by setting the embedding parameter $t$ to $\frac{A}{2\gamma}$.



It is possible to obtain a better guarantee on the correct decoding radius by partitioning the interval where $\lambda_1$ resides:

$$\left[A, \alpha^{(n-1)/2}A\right] \subseteq \bigcup_{i=0}^{\left\lceil \frac{(n-1)\log\alpha}{2\log\kappa} \right\rceil - 1} \left[\kappa^i A, \kappa^{i+1}A\right], \tag{38}$$

where $\kappa > 1$ is arbitrary. Each subinterval is of the form $[A_i, A_i\kappa]$ with $A_i = \kappa^i A$. We apply the embedding technique for each subinterval, choosing

$$t_i = \frac{A_i}{2\gamma}.$$

Each call solves $\gamma$-uSVP with $\gamma = \sqrt{\gamma_n}\,\alpha^{(n+1)/4}$ as in Subsection IV-B for $(n + 1)$-dimensional lattices; at least one of these subintervals contains $\lambda_1$, and therefore the corresponding call provides the closest lattice vector to the target as long as the norm of the noise is less than $\frac{\lambda_1}{2\sqrt{\kappa}\gamma}$. Therefore, using $\left\lceil \frac{(n-1)\log\alpha}{2\log\kappa} \right\rceil$ calls to LLL, we can solve $\gamma'$-BDD with $\gamma' = \sqrt{\kappa}\gamma$. That is, we only lose a factor $\sqrt{\kappa}$, compared to the case when $\lambda_1$ is known. Note that $\kappa$ can be chosen arbitrarily close to 1, at the cost of increasing the number of calls to LLL.
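The trade-off between $\kappa$ and the number of LLL calls can be made concrete with a short helper. This is an illustrative sketch only: the function name `embedding_schedule` and the numerical values are hypothetical, and a small tolerance is subtracted before taking the ceiling to guard against floating-point rounding.

```python
import math


def embedding_schedule(A, gamma, alpha, kappa, n):
    """Embedding parameters t_i = kappa^i * A / (2 * gamma), one per subinterval of (38)."""
    calls = math.ceil((n - 1) * math.log(alpha) / (2 * math.log(kappa)) - 1e-12)
    return [kappa ** i * A / (2 * gamma) for i in range(calls)]


# Hypothetical values: A = 1, gamma = 4, alpha = 2, n = 5.
for kappa in (2.0, 1.5, 1.1):
    ts = embedding_schedule(1.0, 4.0, 2.0, kappa, 5)
    print(f"kappa={kappa}: {len(ts)} calls, loss factor sqrt(kappa)={math.sqrt(kappa):.3f}")
```

Shrinking $\kappa$ towards 1 reduces the loss factor $\sqrt{\kappa}$ but increases the number of subintervals, and hence of LLL calls.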



B. Heuristic approach 

We may also find a good estimate of $\lambda_1$ heuristically.

There is a common belief that the worst-case bounds (11) and (12) are not tight for LLL reduction on average. In low dimensions, the LLL algorithm often finds the successive minimum vectors in a lattice. In [47], the average behavior of LLL reduction for some input distributions was numerically assessed, and it was observed that one should replace the factor $\alpha^{(n-1)/2}$ from (12) by a much smaller value for a random lattice of sufficiently high dimension. The experiments corresponding to Fig. 1 allow one to observe a similar behavior for random basis matrices with i.i.d. Gaussian entries: For $\delta = 0.99$, the factor $\alpha^{(n-1)/2} \approx (1.428)^{n-1}$ from (12) should be replaced by $\approx 1.01^n$.

Independently, we have the upper bound $\lambda_1 \le \sqrt{\gamma_n}\,\alpha^{(n-1)/4}\min_{1 \le i \le n}|r_{i,i}|$ (from Lemma 1 and Equation (13)), where the $|r_{i,i}|$'s can be easily computed from the output basis. For $\delta = 0.99$, this approximately gives $\lambda_1 \le \sqrt{\gamma_n}\,(1.195)^{n-1}\min_{1 \le i \le n}|r_{i,i}|$.

Fig. 2 shows that after the call to LLL with $\delta = 0.99$ and for random input basis matrices with i.i.d. Gaussian entries, we have $\lambda_1 \approx 1.03^n \min_{1 \le i \le n}|r_{i,i}|$.

Figure 1. Experimental value of $\frac{1}{n}\ln F_1$ as a function of the dimension $n$ for the output of LLL with $\delta = 0.99$ and i.i.d. Gaussian inputs.

Figure 2. Experimental value of $\frac{1}{n}\log F_2$, with $F_2 = \lambda_1 / \min_{1 \le i \le n}|r_{i,i}|$, as a function of the dimension $n$ for the output of LLL with $\delta = 0.99$ and i.i.d. Gaussian inputs.

It is also folklore to estimate $\lambda_1$ via the so-called Gaussian heuristic [48]:

$$\lambda_1 \approx \frac{\Gamma(1 + n/2)^{1/n}}{\sqrt{\pi}}(\det\mathcal{L})^{1/n} \approx \sqrt{\frac{n}{2\pi e}}\,(\det\mathcal{L})^{1/n}.$$

This estimate of $\lambda_1$ is the radius of the ball whose volume matches the lattice determinant. The Gaussian heuristic holds for random lattices in a certain sense, and can be made rigorous for precise definitions of random lattices (derived from the theory of Haar measures on classical groups) [47]. However, the experiments in Fig. 3 tend to show that this estimate does not apply for lattices sampled by i.i.d. Gaussian matrices: the minimum $\lambda_1$ seems to follow the Gaussian heuristic in the beginning, but to fall short of the theoretic value when $n$ is large.
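The Gaussian-heuristic estimate is straightforward to evaluate. The sketch below is illustrative: it works with $\log\det\mathcal{L}$ and the log-Gamma function to avoid overflow in high dimensions, and the function names are hypothetical.

```python
import math


def gaussian_heuristic(n, log_det):
    """Radius of the n-dimensional ball whose volume equals det(L); log_det = ln det(L)."""
    return math.exp((math.lgamma(1 + n / 2) + log_det) / n) / math.sqrt(math.pi)


def gaussian_heuristic_asymptotic(n, log_det):
    """Large-n approximation sqrt(n / (2*pi*e)) * det(L)^(1/n)."""
    return math.sqrt(n / (2 * math.pi * math.e)) * math.exp(log_det / n)


for n in (2, 20, 200):
    exact = gaussian_heuristic(n, 0.0)
    approx = gaussian_heuristic_asymptotic(n, 0.0)
    print(f"n={n}: exact={exact:.4f} asymptotic={approx:.4f}")
```

For a unit-determinant lattice the two expressions differ by the factor $(\pi n)^{1/(2n)}$, which tends to 1 as $n$ grows.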

VI. Experiments 

In this section we address the practical implementation of embedding decoding and compare its performance with that of existing methods.

A. Incremental Reduction for Embedding 

Setting $t_0 = A/(2\gamma)$ and $\kappa = \alpha^{1/2}$, we give an efficient implementation of the strategy proposed in Section V-A, where $n - 1$ calls to LLL reduction of the extended matrix (21) are performed for the sequence $\{t_i\}$ of values of $t$ given in equation (38). It is summarized by the pseudocode of the function IncrEmb($\mathbf{B}, \mathbf{y}, t_0$) in Table I. Except the first one, each call to LLL is significantly cheaper, as LLL is called on a reduced matrix whose last row has been multiplied by a small constant factor $\alpha^{1/2}$ (Line 6). This is equivalent to multiplying $t_i$ by $\alpha^{1/2}$, then reducing the extended matrix $\tilde{\mathbf{B}}$. Intuitively, we deform the lattice progressively while preserving the property of being LLL-reduced. At last, we choose the vector that is closest to $\mathbf{y}$.

Figure 3. Experimental value of $F_3$ as a function of the dimension $n$, for bases with i.i.d. Gaussian inputs. The straight line is the theoretic value of the Gaussian heuristic.

Table I
Pseudocode of incremental embedding decoding

Function IncrEmb($\mathbf{B}, \mathbf{y}, t_0$)
1: $\tilde{\mathbf{B}} \leftarrow \begin{bmatrix} \mathbf{B} & -\mathbf{y} \\ \mathbf{0}_{1 \times n} & t_0 \end{bmatrix}$
2: for $i = 1$ to $n - 1$ do
3:    $\tilde{\mathbf{B}}' = \tilde{\mathbf{B}}\mathbf{U} \leftarrow \text{LLL}(\tilde{\mathbf{B}})$
4:    $\mathbf{U}' = \mathbf{U}'\mathbf{U}$
5:    $\hat{\mathbf{x}}_i \leftarrow \mathbf{U}'(1,:)$
6:    $\tilde{\mathbf{B}} \leftarrow \begin{bmatrix} \mathbf{I}_{n \times n} & \mathbf{0}_{n \times 1} \\ \mathbf{0}_{1 \times n} & \alpha^{1/2} \end{bmatrix}\tilde{\mathbf{B}}'$
7: end for
8: $\hat{\mathbf{x}} \leftarrow \arg\min_{\hat{\mathbf{x}}_i}\|\mathbf{y} - \mathbf{B}\hat{\mathbf{x}}_i\|$
9: return $\hat{\mathbf{x}}$

B. List Decoding Based on Embedding 

A practical way to improve the embedding technique is 
to make use of all intermediate lattice vectors during the 
execution of LLL. Such vectors are generated when size 
reduction is performed. Since the number of iterations is 



between O \ 



and O I 



in embedding (see l\m ). and 



since we can obtain one new vector in each size reduction, the 
list size can range from O (n^) to O {n^Y We can integrate 
this into LLL, and the complexity will be of the same order. The size check in LLL is done with respect to the $|r_{i,i}|$. Clearly it is preferable to choose small $t$ in order to make sure that the last column in (21) can be used as many times as possible. Here, we choose

1 . , , 



2^, 



(39) 



na 4 ^i'i' 



which is indeed far smaller than the average-case. 



C. Soft-output Decoding Based on Embedding 

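Soft output is also possible from the constellation points generated in the size reduction. To further improve the performance, near neighbors of the recovered constellation point are also taken into consideration. Once the list is found, we choose to center it on $\mathbf{y}$, and then pick up the $K$ best candidates with the smallest Euclidean norm. The $K$ candidate vectors $\mathcal{Z} = \{\mathbf{z}_1, \ldots, \mathbf{z}_K\}$ can be used to approximate the log-likelihood ratio (LLR), as in HI. For bit $b_i \in \{0, 1\}$, the approximated LLR is computed as

$$\text{LLR}(b_i \mid \mathbf{y}) \approx \log\frac{\sum_{\mathbf{z} \in \mathcal{Z}: b_i(\mathbf{z}) = 1}\exp\left(-\frac{1}{\sigma^2}\|\mathbf{y} - \mathbf{Bz}\|^2\right)}{\sum_{\mathbf{z} \in \mathcal{Z}: b_i(\mathbf{z}) = 0}\exp\left(-\frac{1}{\sigma^2}\|\mathbf{y} - \mathbf{Bz}\|^2\right)} \tag{40}$$

where $b_i(\mathbf{z})$ is the $i$-th information bit associated with the sample $\mathbf{z}$. The notation $\mathbf{z}: b_i(\mathbf{z}) = \mu$ means the set of all vectors $\mathbf{z}$ for which $b_i(\mathbf{z}) = \mu$.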
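A list-based LLR of this form is usually evaluated with a log-sum-exp to avoid underflow. The following sketch is an assumption-laden illustration: the parameter `scale` stands for the factor multiplying $\|\mathbf{y} - \mathbf{Bz}\|^2$ in the exponent (for instance $1/\sigma^2$), the clipping value used when one of the two candidate sets is empty is an arbitrary choice, and the function names are hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))); returns -inf for an empty list."""
    if not values:
        return -math.inf
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def approx_llr(sq_dists, bits, scale, clip=50.0):
    """Approximate LLR of one bit from a non-empty candidate list.

    sq_dists[k] is ||y - B z_k||^2 and bits[k] is the value of the bit in z_k.
    """
    ones = [-scale * d for d, b in zip(sq_dists, bits) if b == 1]
    zeros = [-scale * d for d, b in zip(sq_dists, bits) if b == 0]
    llr = log_sum_exp(ones) - log_sum_exp(zeros)
    return max(-clip, min(clip, llr))   # clip when one hypothesis has no candidate


# Hypothetical list of K = 4 candidates.
print(approx_llr([0.2, 1.5, 0.9, 3.0], [1, 0, 1, 0], scale=2.0))
```

With a short list it is common for all candidates to agree on a bit, in which case the ratio in the LLR is degenerate; the clipping above is one simple way to handle that case.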

D. Simulation Results 

This subsection examines the error performance of the embedding technique. For comparison purposes, the performances of lattice-reduction-aided SIC and ML decoding are also shown. We assume perfect channel state information at the receiver, and use MMSE-GDFE left preprocessing for the suboptimal decoders. Monte Carlo simulation was used to estimate the bit error rate with Gray mapping and LLL reduction ($\delta = 0.75$).

Fig. 4 shows the bit error rate for an uncoded MIMO system with $n_T = n_R = 10$, 64-QAM. We found that the list and incremental versions of embedding achieve near-optimum performance in this setting; the SNR loss is about 1 dB. Both of them are better than ALR [15] and embedding using the exact knowledge of $\lambda_1$ ("exact MMSE embedding"). We also observed poor performance for the choice of the embedding parameter $t = \text{dist}(\mathbf{y}, \mathbf{B})$ in [33].

Fig. 5 shows the achieved performance of embedding decoding for the $4 \times 4$ Perfect code using 64-QAM. The decoding lattices are of dimension 16 in the complex space (and 32 in the real space). The list version of embedding enjoys 3.5 dB gain over LLL-SIC, while embedding using the average estimate of $\lambda_1$ in Section V-B ("average MMSE embedding") also has more than 2 dB gain.

Fig. 6 compares the average complexity of LLL-SIC decoding, embedding decoding and sphere decoding for uncoded MIMO systems using 64-QAM.

Fig. 7 shows the frame error rate for a coded $4 \times 4$ MIMO system with 4-QAM. For channel coding, we use a rate-1/2, irregular (256, 128, 3) low-density parity-check (LDPC) code of codeword length 256 (i.e., 128 information bits) [50]. Each codeword spans one channel realization. The parity check matrix is randomly constructed, but cycles of length 4 are eliminated. The maximum number of decoding iterations is set at 50 for the LDPC. It is seen that the soft-output version of embedding decoding is also nearly optimal when $K = 20$, with a performance very close to maximum a posteriori probability (MAP) decoding and much better than an MMSE-only detector followed by per-symbol LLR calculation.

Figure 4. Bit error rate vs. average SNR per bit for the uncoded $10 \times 10$ system using 64-QAM.

Figure 5. Bit error rate vs. average SNR per bit for the $4 \times 4$ perfect code using 64-QAM.

VII. Conclusions and Discussion

In this paper, we have studied the embedding technique from a BDD point of view. We have investigated the relation between Hermite SVP and uSVP and improved a previously known bound on the value $\gamma$ for which LLL reduction provides a solution to $\gamma$-uSVP. Moreover, we proved that BDD of the regularized lattice is DMT-optimal. The polynomial complexity and near-optimum performance of the embedding technique make it very attractive in decoding applications.

Figure 6. Average number of floating-point operations for uncoded MIMO at average SNR per bit = 17 dB. Dimension $n = 2n_T = 2n_R$.

Figure 7. Frame error rate vs. average SNR per bit for the $4 \times 4$ rate-1/2 LDPC code of codeword length 256 using 4-QAM.

We proposed variants with different embedding parameters t 
that are easy to compute and do not require the knowledge of 
the minimum distance Ai of the lattice: a rigorous version for 
which we can provide a theoretical estimate of the decoding 
radius, a heuristic version based on a heuristic estimate of Ai 
with lower computational complexity, and a list-based embed- 
ding scheme with improved BER performance. Our numerical 
simulations provide evidence that a significant fraction of the 
gap to ML decoding can be recovered. 

We have proven that the correct decoding radius achieved by the LLL-based embedding technique is at least as large as the one achieved by LLL-SIC. Experimentally, it seems that it is in fact strictly larger. It would be interesting to explain why this is indeed the case and to what extent. One possibility would be that the embedding technique benefits on average from the noise vector following a normal distribution.